Abstract

This paper studies the structure of downlink sum-rate maximizing selective decentralized feedback
policies for opportunistic beamforming under finite feedback constraints on the average number of 
mobile users feeding back. Firstly, it is shown that any sum-rate maximizing selective decentralized 
feedback policy must be a threshold feedback policy. This result holds for all fading channel models 
with continuous distribution functions. Secondly, the resulting optimum threshold selection problem is 
analyzed in detail. This is a non-convex optimization problem over finite dimensional Euclidean spaces. 
By utilizing the theory of majorization, an underlying Schur-concave structure in the sum-rate function 
is identified, and the sufficient conditions for the optimality of homogenous threshold feedback policies 
are obtained. Applications of these results are illustrated for well known fading channel models such 
as Rayleigh, Nakagami and Rician fading channels, along with various engineering and design insights. 
Rather surprisingly, it is shown that using the same threshold value at all mobile users is not always a
rate-wise optimal feedback strategy, even for a network with identical mobile users experiencing statistically
the same channel conditions. For the Rayleigh fading channel model, on the other hand, homogenous 
threshold feedback policies are proven to be rate-wise optimal if multiple orthonormal data carrying 
beams are used to communicate with multiple mobile users simultaneously. 
I. Introduction
A. Background and Motivation 

A key design challenge in fourth generation (4G) wireless networks is to achieve data rates as high as 
1 Gbit/s for low mobility and 100 Mbit/s for high mobility [1]. Multiple-input multiple-output (MIMO) 
technology marks a paradigm change from scalar communication to the vector one, and has now become 
an integral part of 4G wireless networks to accomplish such high data rate targets. Benefits include power 
gain, diversity gain and degrees-of-freedom gain (or, alternatively called multiplexing gain) [2]-[4], to 
name a few. When there is a multitude of mobile users (MU) in the communication environment, which 
is usually the case in a network setting, random beamforming techniques can also be used to utilize 
multiuser diversity gain [5]. 

Some of these gains, such as diversity gain, can be harvested without requiring any knowledge of wireless
channel states, but some others can be exploited effectively only through some form of channel state
information (CSI) at the base station (BS) [6], [7]. In such communication instances necessitating the use 
of CSI for adaptive signaling, feedback is an important means to convey required information from MUs 
to the BS. This paper studies rate-wise optimal selective feedback policies for vector broadcast channels, 
and establishes the structure of such policies under finite feedback constraints. 

We consider the classical opportunistic communication along multiple orthonormal beams. The focus is 
on the total downlink communication rate¹, and the BS is provided only with partial CSI (i.e., downlink
SINR values) for scheduling such as in the IS-856 standard. Hence, the (full CSI) sum-rate capacity 
achieving dirty paper precoding [8]-[12], or any other transmit beamforming strategy requiring full CSI 
to this end, is automatically disallowed. The wireless channels, and therefore the
signal-to-interference-plus-noise ratios (SINR) attained by different users on different beams, change over time. The BS
selects the best user (with the highest SINR) per beam to maximize the sum-rate at the downlink. 

This is the opportunistic beamforming (OBF) approach utilizing multiuser diversity and varying channel 
conditions to extract all degrees-of-freedom available for the downlink communication (provided by the 
use of multiple transmit antennas) as well as to deliver improved power gains [5], [13]. Indeed, it 
achieves the same full CSI sum-rate capacity to a first order for large numbers of MUs in the network 
[13]. However, for large numbers of MUs, the OBF approach still requires large amounts of data to be 
fed back, which is an onerous requirement on the uplink feedback channel. What is needed is a selective
decentralized feedback policy that will only choose a small subset of MUs to be multiplexed on the
uplink feedback channel. In this case, the downlink sum-rate is certainly a function of the feedback policy
selecting MUs. We ask: What is the structure of the sum-rate maximizing selective decentralized feedback
policies, and how does the resulting sum-rate compare to the sum-rate without any user selection?

¹Unless otherwise stated, we use the term rate (or, its derivatives such as sum-rate, total rate, aggregate communication rate)
to always refer to the ergodic rate obtained by averaging over many fading states.

B. Contributions 

Our main findings can be summarized as follows. We first show that any sum-rate maximizing selective 
decentralized feedback policy for a given constraint on the average number of users feeding back must 
be a threshold feedback policy in which each MU, independently from others, decides to feed back or 
not by comparing its SINR values with a predetermined threshold value. Different MUs are allowed 
to have different thresholds if such heterogeneity in thresholds maximizes the total downlink rate. This 
thresholding optimality result does not depend on the particular statistical model of the wireless channel 
as long as the resulting SINR distribution is continuous, which holds for most common fading models 
such as Rayleigh, Rician and Nakagami fading. It also possesses a stability property from a game theoretic 
point of view as explained in Section IV. 

These findings provide an analytical justification for the use of threshold feedback policies in practical 
systems, and strengthen previous work on thresholding as an appropriate selective feedback scheme, e.g., 
see [13]-[18]. They also form a basis for the optimum threshold selection problem analyzed in Section 
V. To some extent, our thresholding optimality result is intuitive and expected. It is even known to hold 
in the limit without feedback constraints for richly scattered Rayleigh fading environments [16], [18]. 
However, its proof in our case is not straightforward, and requires a careful analysis of rate gain and 
loss events due to coupling effects, induced by finite feedback constraints, of MUs' individual feedback 
rules on the sum-rate function. 

Having established the optimality of threshold feedback policies, we now face the optimum threshold 
selection problem to further maximize the downlink sum-rate. This optimization problem is over the 
familiar finite dimensional Euclidean spaces, but it turns out that the objective sum-rate function is not 
necessarily convex as a function of MUs' threshold values. Thus, we resort to the theory of majorization 
[19], and solve the optimum threshold selection problem by identifying an underlying Schur-concave 
structure in the sum-rate function. In particular, we obtain sufficient conditions for the Schur-concavity 
of the sum-rate, and therefore for the rate optimality of homogenous threshold feedback policies in which 
all MUs use the same threshold for their feedback decisions. These conditions are provided for general 
fading models under some mild conditions on the resulting SINR distribution, which are satisfied by 
most common fading models such as Rayleigh, Rician and Nakagami fading.

A naive but intuitive approach to maximize the total downlink communication rate for a network with 
identical MUs experiencing statistically the same channel conditions is to use a homogenous threshold 
feedback policy satisfying feedback constraints. Rather surprisingly, our results reveal that this intuition 
does not always work here. We provide a simple counterexample in which only a single beam is used 
for the downlink communication with two MUs in a Rayleigh fading environment. In the high
signal-to-noise ratio (SNR) regime, necessary conditions for the Schur-concavity of the sum-rate are violated,
and it becomes strictly suboptimal to use the same threshold value to mediate MUs' feedback decisions. 
Indeed, we prefer one MU over the other one by assigning a small threshold for this MU to minimize 
the feedback outage event probability, i.e., the probability that none of the MUs feeds back. On the 
other hand, we show that the sum-rate is a Schur-concave function when the SNR is low, and therefore 
the homogenous threshold feedback policy satisfying feedback constraints with equality is the optimum 
policy to maximize the sum-rate in the low SNR regime. To put it in another way, we trade the power gain 
(due to multiuser diversity) for the degrees-of-freedom gain (due to minimum outage communication) in 
the high SNR regime, whereas the degrees-of-freedom gain is traded for the power gain in the low SNR 
regime. An extensive numerical study utilizing our sufficient conditions is also performed to illustrate 
optimality and sub-optimality regions for the homogenous threshold feedback policies for fading models 
other than Rayleigh fading such as Rician and Nakagami fading. 

On the more positive side, we show that the sum-rate is always a Schur-concave function for all 
values of SNR when two or more orthonormal beams are used to simultaneously communicate with 
multiple MUs located in a Rayleigh fading environment. In this case, the downlink communication 
becomes interference limited, rather than noise limited, due to inter-beam interference, and therefore the 
behavior of the optimum threshold feedback policy becomes unchanged for all SNR values: Use the 
same threshold for all MUs such that the feedback constraint is satisfied with equality. For this fading 
scenario, the difference between communication rates achieved with and without user selection is also 
illustrated. In particular, when the threshold values are optimally set for large user populations, there is 
almost no rate loss if the average number of MUs feeding back per beam is around five. From a practical
point of view, this implies a significant reduction in the feedback load without noticeable performance
loss, and provides an important cross-layer design parameter for the higher MAC layer for multiplexing 
MUs on the uplink to feed back. 

The remainder of the paper is organized as follows. In Section II, we compare and contrast our 
results with the relevant previous work. Section III describes the system model, provides basic concepts 
and definitions to be used in the rest of the paper, and formulates the problem of finding the optimal
feedback policy maximizing the aggregate communication rate under finite feedback constraints as a 
function optimization problem. In Section IV, we show that this function optimization problem can be 
reduced to a finite dimensional but non-convex optimum threshold selection problem by establishing 
the optimality of threshold feedback policies. In Section V, we solve the optimum threshold selection 
problem by using the theory of majorization. In particular, sufficient conditions for the Schur-concavity 
of the sum-rate function are derived. Section VI presents an extensive numerical and simulation study to 
illustrate the applications of these results to familiar fading models along with various engineering and 
design insights. Section VII concludes the paper. 

II. Related Work 

Feedback load reduction techniques for adaptive signaling in wireless communication networks have 
been a key area of research for more than a decade, especially with the advent of MIMO technology, 
e.g., see [7] and the references therein for an overview of feedback load reduction techniques in wireless 
communication systems. Among many promising approaches proposed over the last decade, OBF (a.k.a., 
opportunistic beamforming) has attracted considerable attention and research effort since its inception 
by Viswanath et al. in [5]. It is a practical way of reducing feedback requirements for vector broadcast 
channels, yet still achieves the full CSI sum-rate capacity at the downlink to a first order [13]. In this 
paper, we are also motivated by such opportunistic communication and beamforming techniques, and 
focus on the downlink sum-rate maximization under finite feedback constraints on the feedback uplink. 

Capacity scaling laws attained by OBF were first obtained by Sharif and Hassibi in [13]. Among many
other results, they, in particular, showed that if an opportunistic scheduling algorithm is used to harvest 
multiuser diversity gains, the downlink throughput scales optimally like M log log n, where M is the 
number of transmit antennas at the BS, and n is the number of MUs in the system. In [20], the authors 
built upon [13] to derive tighter expressions for the downlink sum-rate scaling for OBF. Unlike these 
works, the results derived for the structure and optimization of the downlink sum-rate in this paper are 
correct for any number (small and large) of MUs in the network. In addition, the sum-rate maximization 
problem addressed in this paper does not appear in [13] and [20]. 

Without any user selection, the number of MUs feeding back grows linearly with the total number of 
MUs in the system to achieve double logarithmic growth in the downlink sum-rate. Threshold feedback 
policies are frequently used to alleviate such an excessive feedback requirement on the uplink [13]-[18]. 
In [14], the authors proposed to use a common threshold level to arbitrate MUs' feedback decisions for 
scalar channels. They showed that this approach has the potential to significantly reduce the total feedback
load on the uplink while maintaining almost the same sum-rate performance at the downlink. In [15], the 
authors extended the feedback scheme proposed in [14] by using multiple threshold levels. This paper 
differs from [14] and [15] in three important aspects. Firstly, we provide an analytical justification for why 
threshold feedback policies are the right choice for user selection, e.g., see Section IV for details. Secondly,
we pose an optimum threshold selection problem in which we search for the optimum assignment of 
thresholds to MUs. We show that using the same threshold value for all MUs is not always optimum 
even if all MUs experience statistically the same channel conditions. Finally, our results are given for 
more general vector broadcast channels. 

In [13] and [17], the authors used a constant threshold level, the same for all MUs and independent 
of the number of MUs, to reduce the total feedback load for vector broadcast channels within the OBF 
framework. Such a constant thresholding scheme cannot eliminate the linear growth in the average number 
of MUs feeding back. In [16], it was shown that it is enough to have only $O(\log n)$ MUs feeding back
to achieve the same downlink sum-rate scaling in [13] by varying the common threshold level with the
total number of MUs in the system. This result was extended in [18] by showing that $O((\log n)^{\epsilon})$,
$\epsilon \in (0, 1)$, MUs are enough to achieve the same downlink sum-rate scaling in [13]. It is almost as
if a constant feedback load were enough to maintain the optimum sum-rate scaling, but not exactly. In contrast
to these previous works, we focus on more stringent but practical constant feedback requirements in 
this paper. The sum-rate maximization framework introduced here does not exist in these papers, either. 
Finally, these previous works only focused on the asymptotic sum-rate scaling behavior, whereas our 
results are correct for any finite number of MUs in the network. 

An important issue associated with OBF is its applicability to finite networks. [21]-[24] propose various 
methods for optimizing OBF for smaller sets of MUs. [21] and [22] propose algorithms to select a target 
group of MUs, and then to request perfect CSI only from the selected set of MUs to facilitate more efficient 
beamforming schemes. [23] and [24] show how feedback aggregation and multiple beamforming vectors 
can be utilized to fine-tune OBF, respectively. Similar to these works, we also focus on finite networks 
in this paper. However, our problem set-up and motivation are much different than those in [21]-[24]. 
Here, we are interested in the structure of feedback policies maximizing the downlink sum-rate given the 
constraints on the average number of MUs to be multiplexed on the uplink for feedback. 

Fairness is also among the important topics for OBF. The proportionally fair algorithm proposed in [5]
ensures long-term fairness among MUs in terms of average data rates achieved. Although indirectly,
this paper reveals an interesting and somewhat counterintuitive observation with regard to fairness in the
OBF framework. Even for a network with statistically identical MUs, we show that it may become more
favorable to treat MUs unequally, i.e., to prefer one group of MUs over others by allocating the wireless 
channel to them more frequently, to maximize the downlink sum-rate. We also obtain various sufficient 
conditions on wireless channel statistics under which fairness is automatically achieved, i.e., all MUs 
are given equal chances to feed back and to access the channel. A detailed discussion on this account is 
provided in Sections V and VI. 

Related work also includes [25]-[28]. In [25]-[27], CSI parameters were quantized to reduce the 
feedback load for OBF. This approach cannot eliminate the linear feedback load growth alone, but it leads 
to further feedback reductions when combined with a user selection protocol. In this paper, we solve the 
optimum threshold selection problem offline under statistical information about wireless channels. Once 
the thresholds are optimally assigned for user selection, it is an added design choice how to quantize 
SINR parameters, and the resulting performance analysis requires further investigation, which we do not 
address in this paper. 

In [28], the authors focused on exploiting multiuser diversity in a distributed manner for scalar multiple 
access channels by means of thresholds. Their MAC layer consisted of a collision channel model, and 
the thresholds were chosen to be the same for all MUs with identical channel statistics. Although we 
focus on the dual vector broadcast channels in this paper without any attention to the multiple access
feedback uplink, our results have some ramifications for the MAC problem studied in [28]. First of all, 
our homogenous threshold optimality results imply that using different threshold levels for different MUs 
with identical channel conditions may further improve the data rates reported in [28]. Secondly, they 
provide a cross-layer design parameter for the number of MUs to be multiplexed on the uplink (for 
feedback) without any noticeable performance degradation at the downlink. 

III. System Model and Problem Formulation 

We consider a multi-antenna single cell vector broadcast channel. There are n MUs in the cell. The 
BS has $N_t$ transmit antennas, and each MU is equipped with a single receive antenna. The channel
gains between the receive antenna of the ith MU and the transmit antennas of the BS are given by
$\mathbf{h}_i = (h_{1,i}, \ldots, h_{N_t,i})^\top$, where $h_{k,i}$ is the channel gain between the kth transmit antenna at the BS and
the receive antenna at the ith MU. We assume that $h_{k,i}$, $k = 1, \ldots, N_t$ and $i = 1, \ldots, n$, are independent
and identically distributed (i.i.d.) random variables. In addition, we assume a quasi-static block fading 
model, in which channel gains are constant through a coherence time interval, and change from one 
coherence period to another independently according to a common fading distribution. For the sake of 
notational simplicity, we drop the time index here in the channel model, and also later in the representation
of transmitted and received signals. 

Our signal model is similar to the one in [13]. The BS transmits $M$, $M \le N_t$, different data streams
intended for $M$ different MUs. The symbols of the kth stream are represented by $s_k$. They are chosen
from the capacity achieving unit power (complex) Gaussian codebooks, and are sent along the directions
of $M$ orthonormal beamforming vectors $\{\mathbf{b}_k = (b_{1,k}, \ldots, b_{N_t,k})^\top\}_{k=1}^{M}$. These beamforming vectors can
be either deterministic, or randomly generated and updated periodically. The overall transmitted signal
from the BS is given by

$$\mathbf{s} = \sqrt{\rho} \sum_{k=1}^{M} \mathbf{b}_k s_k, \qquad (1)$$

where $\rho$ is the transmit power per beam. The signal received by the ith MU is equal to

$$Y_i = \sqrt{\rho} \sum_{k=1}^{M} \mathbf{h}_i^\top \mathbf{b}_k s_k + Z_i, \qquad (2)$$

where $Z_i$ is the unit power (complex) Gaussian background noise. With these normalized parameter
selections, $\rho$ also signifies the SNR per beam as in [13]. Let $\gamma_{m,i}$ be the SINR value corresponding to
the mth beam at the ith MU. Then, it is given by

$$\gamma_{m,i} = \frac{|\mathbf{h}_i^\top \mathbf{b}_m|^2}{\rho^{-1} + \sum_{k=1, k \neq m}^{M} |\mathbf{h}_i^\top \mathbf{b}_k|^2}. \qquad (3)$$

Let $\boldsymbol{\gamma}_i = (\gamma_{1,i}, \ldots, \gamma_{M,i})^\top \in \mathbb{R}_+^M$ represent the SINR vector at MU $i$. Beams are statistically identical,
and the elements of $\boldsymbol{\gamma}_i$ are identically distributed for all $i \in \mathcal{N}$ with a common marginal distribution
$F$, where $\mathcal{N} = \{1, \ldots, n\}$. However, SINR values at a particular MU are dependent random variables,
see (3). We will assume that $F$ is continuous, and has the density $f$ with support $\mathbb{R}_+$, which are
true for many fading models including Rayleigh, Rician and Nakagami fading. Similar assumptions on
the fading distribution also exist in [28]. For the ease of notation, if $M = 1$, we will use $\gamma_i$ to denote
the SINR of MU $i$ on this single beam. $\boldsymbol{\Gamma} = [\boldsymbol{\gamma}_1, \ldots, \boldsymbol{\gamma}_n] \in \mathbb{R}_+^{M \times n}$ is the system-wide M-by-n SINR
matrix that contains the SINR vectors of all MUs in the system.
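
As a concrete illustration of (3), the following Python sketch computes the SINR vector of a single MU from its channel gains. It is a minimal sketch under stated assumptions: the function name `sinr_vector`, the list-based representation of vectors, and the standard-basis beams used in the example lines are illustrative choices and are not part of the system model above.

```python
from typing import List, Sequence


def sinr_vector(h: Sequence[complex],
                beams: Sequence[Sequence[complex]],
                rho: float) -> List[float]:
    """Return the SINR of one MU on every beam, following (3).

    h     : channel gains of the MU (length N_t)
    beams : M orthonormal beamforming vectors, each of length N_t
    rho   : transmit power (SNR) per beam
    """
    # Received power of each beam at this MU: |h^T b_k|^2.
    powers = [abs(sum(hj * bj for hj, bj in zip(h, b))) ** 2 for b in beams]
    total = sum(powers)
    # Signal power on beam m over noise (1/rho) plus inter-beam interference.
    return [p / (1.0 / rho + (total - p)) for p in powers]


# Hypothetical example: N_t = M = 2 with standard-basis beams (an assumption).
example_beams = [[1, 0], [0, 1]]
example_sinr = sinr_vector([1 + 0j, 2 + 0j], example_beams, rho=1.0)
print(example_sinr)
```

Stacking such vectors for all $n$ MUs column by column gives one realization of the SINR matrix $\boldsymbol{\Gamma}$.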

If the BS has perfect knowledge of $\boldsymbol{\Gamma}$, the aggregate communication rate can be maximized by choosing
the best MU with the highest SINR on each beam. However, this necessitates an excessive amount of
feedback and information exchange between the BS and MUs. Therefore, we focus on the sum-rate 
maximization under finite feedback constraints, where MUs feed back according to a predefined selective 
feedback policy as defined below. 
Let $\gamma_i^* = \max_{1 \le k \le M} \gamma_{k,i}$ be the maximum SINR value at MU $i$, and let $b_i^* = \arg\max_{1 \le k \le M} \gamma_{k,i}$ be
the index of the best beam achieving $\gamma_i^*$. Let also $\mathcal{M} = \{1, \ldots, M\}$. Using these notations, we formally
define a feedback policy as follows.

Definition 1: A feedback policy $\mathcal{F}: \mathbb{R}_+^{M \times n} \to \{\Omega \cup \{\emptyset\}\}^n$ is an $\{\Omega \cup \{\emptyset\}\}^n$-valued function
$\mathcal{F} = (\mathcal{F}_1, \ldots, \mathcal{F}_n)^\top$, where $\mathcal{F}_i: \mathbb{R}_+^{M \times n} \to \Omega \cup \{\emptyset\}$ is the feedback rule of MU $i$, $\Omega$ is the set of all feedback
packets and $\emptyset$ represents the no-feedback state. We call $\mathcal{F}$ a general decentralized feedback policy if $\mathcal{F}_i$
is only a function of $\boldsymbol{\gamma}_i$ for all $i \in \mathcal{N}$. We call it a homogenous general decentralized feedback policy
if it is decentralized and all MUs use the same feedback rule. Finally, we call it a maximum SINR
decentralized feedback policy, if $\mathcal{F}_i(\boldsymbol{\gamma}_i)$ is only a function of $\gamma_i^*$ and $b_i^*$, and produces a feedback packet
containing $\gamma_i^*$ as the sole SINR information on a positive feedback decision, and otherwise produces $\emptyset$.

Intuitively, a feedback policy determines whether a MU will feed back or not. Upon a positive feedback 
decision, it generates a feedback packet containing SINR values at selected beams (along with other 
information to be contained in the packet header), and sends it to the BS for central processing. When it 
is clear from the context, we will omit the term "general". We will index system-wide feedback policies by
superscripts, and individual feedback rules by subscripts such as $\mathcal{F}_i$. We use the term "policy"
to refer to system-wide feedback rules, whereas the term "rule" is used to refer to individual feedback 
rules. The definitions given for system-wide feedback policies extend to individual feedback rules in an 
obvious way when possible. We assume that there is no cooperation between different MUs, which is true
for most practical systems. Therefore, we can narrow down our study to decentralized feedback policies
for the system in consideration. 

Furthermore, we will focus our attention on beam symmetric feedback policies since beams are assumed 
to be statistically identical. We formally define beam symmetric policies as follows. 

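Definition 2: Let $\Pi: \mathbb{R}_+^M \to \mathbb{R}_+^M$ be a permutation mapping, i.e., $\Pi(\boldsymbol{\gamma}) = (\gamma_{\pi(1)}, \ldots, \gamma_{\pi(M)})^\top$ for
some one-to-one $\pi: \mathcal{M} \to \mathcal{M}$. For $\boldsymbol{\Gamma} \in \mathbb{R}_+^{M \times n}$, let $\Pi(\boldsymbol{\Gamma}) = [\Pi(\boldsymbol{\gamma}_1), \ldots, \Pi(\boldsymbol{\gamma}_n)]$. If $\mathcal{I}_i$ is the set of
beam indexes selected by $\mathcal{F}_i(\boldsymbol{\Gamma})$, and $\pi(\mathcal{I}_i)$ is the set of beam indexes selected by $\mathcal{F}_i(\Pi(\boldsymbol{\Gamma}))$ for all
$i \in \mathcal{N}$, we say $\mathcal{F}$ is a beam symmetric feedback policy.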

This symmetry assumption is just for the sake of notational simplicity, and the same techniques can 
be generalized to beam asymmetric policies by allowing different feedback policies for different beams 
at MUs. We let H denote the set of all beam symmetric decentralized feedback policies. When it is clear 
from the context, we will also omit the term "beam symmetric". 

Given a feedback policy $\mathcal{F}$, we have a random set of MUs $\mathcal{G}_m(\mathcal{F}(\boldsymbol{\Gamma}))$ requesting beam $m \in \mathcal{M}$. When
$\mathcal{G}_m(\mathcal{F}(\boldsymbol{\Gamma}))$ is a non-empty set at a given fading state, the BS selects the MU with the highest SINR in
this set to maximize the instantaneous communication rate in the direction of beam $m$. If $\mathcal{G}_m(\mathcal{F}(\boldsymbol{\Gamma}))$ is
an empty set, we say a feedback outage event occurs at beam $m$, and zero rate is achieved at this beam.²
Then, the downlink ergodic sum-rate achieved under the feedback policy $\mathcal{F}$ is given by



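$$R(\mathcal{F}) = \mathsf{E}_{\boldsymbol{\Gamma}}\left[r(\mathcal{F}, \boldsymbol{\Gamma})\right] = \mathsf{E}_{\boldsymbol{\Gamma}}\left[\sum_{m=1}^{M} \log\left(1 + \max_{i \in \mathcal{G}_m(\mathcal{F}(\boldsymbol{\Gamma}))} \gamma_{m,i}\right)\right], \qquad (4)$$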

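where $r(\mathcal{F}, \boldsymbol{\Gamma})$ is the instantaneous sum-rate achieved under the feedback policy $\mathcal{F}$, expectation is taken
over the random SINR matrices, and the result of the maximum operation is zero when $\mathcal{G}_m(\mathcal{F}(\boldsymbol{\Gamma}))$ is an
empty set. $r^m(\mathcal{F}, \boldsymbol{\Gamma})$ and $R^m(\mathcal{F})$ denote the instantaneous sum-rate and the ergodic sum-rate on beam $m$,
respectively. Note that $r^m(\mathcal{F}, \boldsymbol{\Gamma}) = \log\left(1 + \max_{i \in \mathcal{G}_m(\mathcal{F}(\boldsymbol{\Gamma}))} \gamma_{m,i}\right)$, and $R^m(\mathcal{F}) = \mathsf{E}_{\boldsymbol{\Gamma}}\left[r^m(\mathcal{F}, \boldsymbol{\Gamma})\right]$. Also,
the sum-rate achieved on an event $A$ under $\mathcal{F}$ is written as $R(\mathcal{F}, A) = \mathsf{E}_{\boldsymbol{\Gamma}}\left[r(\mathcal{F}, \boldsymbol{\Gamma}) 1_A\right]$, and conditioned
on an event $A$ (or, a random variable), we define the conditional sum-rate as $R(\mathcal{F} | A) = \mathsf{E}_{\boldsymbol{\Gamma}}\left[r(\mathcal{F}, \boldsymbol{\Gamma}) | A\right]$.
We will use $R(\mathcal{F})$ as the performance measure of a given feedback policy along the rate dimension.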
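
The rate definition in (4) can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: rates are measured in nats (natural logarithm), the SINR matrix is stored as `gamma[i][m]` (MU $i$, beam $m$), a feedback decision is represented by the set of requested beam indexes, and `ergodic_sum_rate` assumes, for brevity, a homogenous decentralized policy passed as a function of one MU's SINR vector. All function names are hypothetical.

```python
import math
from typing import Callable, Sequence, Set


def instantaneous_sum_rate(gamma: Sequence[Sequence[float]],
                           requests: Sequence[Set[int]]) -> float:
    """Instantaneous sum-rate r(F, Gamma) of (4), in nats.

    gamma[i][m] : SINR of MU i on beam m
    requests[i] : beam indexes MU i feeds back on (empty set = no feedback)
    """
    num_beams = len(gamma[0])
    rate = 0.0
    for m in range(num_beams):
        # SINR values of the MUs requesting beam m.
        candidates = [gamma[i][m] for i in range(len(gamma)) if m in requests[i]]
        if candidates:
            # The BS schedules the requesting MU with the highest SINR.
            rate += math.log(1.0 + max(candidates))
        # Otherwise a feedback outage occurs and beam m contributes zero rate.
    return rate


def ergodic_sum_rate(samples: Sequence[Sequence[Sequence[float]]],
                     rule: Callable[[Sequence[float]], Set[int]]) -> float:
    """Sample-average estimate of R(F) over the given SINR matrices."""
    rates = [instantaneous_sum_rate(g, [rule(row) for row in g]) for g in samples]
    return sum(rates) / len(rates)


# Hypothetical two-MU, two-beam realization.
gamma = [[1.0, 0.5], [3.0, 0.2]]
print(instantaneous_sum_rate(gamma, [{0, 1}, {0}]))
```

In the example, MU 2 wins beam 1 and MU 1 is the only MU requesting beam 2, so both beams contribute a non-zero rate.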

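Given a feedback policy $\mathcal{F}$, we will use the average number of MUs feeding back per beam $\Lambda(\mathcal{F})$ to
measure the performance of $\mathcal{F}$ along the feedback dimension. $\Lambda(\mathcal{F})$ can be written as $\Lambda(\mathcal{F}) = \sum_{i=1}^{n} p_i$,
where $p_i = \Pr\{\mathcal{F}_i(\boldsymbol{\Gamma}) \text{ selects beam } 1\}$ since $\mathcal{F}$ is beam symmetric. We are interested in maximizing
the ergodic sum-rate under finite feedback constraints, and the resulting rate maximization problem can
be written as

$$\begin{array}{ll} \underset{\mathcal{F} \in H}{\text{maximize}} & R(\mathcal{F}) \\ \text{subject to} & \Lambda(\mathcal{F}) \le \lambda \end{array}, \qquad (5)$$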

i.e., find the optimal feedback policy maximizing the aggregate communication rate subject to the feedback
constraint $\lambda$. This optimization problem is over function spaces [29], and the objective function is not
necessarily convex. Firstly, we will reduce the search for optimal feedback policies to an optimal threshold
selection problem over finite dimensional Euclidean spaces by proving rate-wise optimality of threshold
feedback policies. Then, we will make use of an underlying Schur-concave structure in the objective
function to solve the resulting optimal threshold selection problem. The next section establishes the
optimality of threshold feedback policies.
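
To make the feedback constraint in (5) concrete, the sketch below evaluates the feedback load of a threshold rule and solves for the common threshold that meets the constraint with equality. It relies on assumptions beyond the general model: a single beam ($M = 1$) and Rayleigh fading with unit-variance channel gains, so that the SINR is exponentially distributed with mean $\rho$. The function names are hypothetical.

```python
import math
from typing import Sequence


def feedback_load(thresholds: Sequence[float], rho: float) -> float:
    """Average number of MUs feeding back: sum_i Pr{gamma_i >= tau_i}.

    Assumes gamma_i is exponential with mean rho (single beam, Rayleigh fading).
    """
    return sum(math.exp(-tau / rho) for tau in thresholds)


def homogeneous_threshold(n: int, lam: float, rho: float) -> float:
    """Common threshold tau satisfying n * Pr{gamma >= tau} = lam."""
    if not 0.0 < lam <= n:
        raise ValueError("lam must lie in (0, n]")
    return rho * math.log(n / lam)


# Hypothetical numbers: 10 MUs, on average 5 of them allowed to feed back.
tau = homogeneous_threshold(n=10, lam=5.0, rho=2.0)
print(tau, feedback_load([tau] * 10, rho=2.0))
```

For other fading models, the closed-form inverse would be replaced by a numerical inversion of the SINR distribution $F$.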

²Note that the BS does not have access to any CSI on the feedback outage event. Without any CSI, reliable communication
is still possible if we can average over very large time-scales for all MUs. The extra rate term to be added to (4) in this case
would not affect our analysis in the remainder of the paper, and therefore is omitted for simplicity.
IV. Optimality of Threshold Feedback Policies

In this section, we show that the solution of the optimization problem posed in (5) must be a threshold 
feedback policy. We start our analysis by formally defining threshold feedback policies. 

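Definition 3: We say $\mathcal{T} = (\mathcal{T}_1, \ldots, \mathcal{T}_n)^\top$ is a general threshold feedback policy (GTFP) if, for all
$i \in \mathcal{N}$, there is a threshold $\tau_i$ such that $\mathcal{T}_i(\boldsymbol{\gamma}_i)$ generates a feedback packet containing SINR values
$\{\gamma_{k,i}\}_{k \in \mathcal{I}_i}$ if and only if $\gamma_{k,i} \ge \tau_i$ for all $k \in \mathcal{I}_i \subseteq \mathcal{M}$. We call it a homogenous general threshold
feedback policy if all MUs use the same threshold $\tau$, i.e., $\tau_i = \tau$ for all $i \in \mathcal{N}$.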

We note that a MU can be allocated to multiple beams according to Definition 3. Another class of 
threshold feedback policies are the feedback policies limiting each MU to request only the beam with the 
highest SINR, e.g., see [13], [16]-[18]. We call this class of feedback policies maximum SINR threshold 
feedback policies, and formally define them as follows. 

Definition 4: $\mathcal{T} = (\mathcal{T}_1, \ldots, \mathcal{T}_n)^\top$ is a maximum SINR threshold feedback policy (MTFP) if, for all
$i \in \mathcal{N}$, there is a threshold $\tau_i$ such that $\mathcal{T}_i(\boldsymbol{\gamma}_i)$ produces a feedback packet requesting beam $k$ and
containing $\gamma_{k,i}$ as the sole SINR information if and only if $b_i^* = k$ and $\gamma_i^* \ge \tau_i$.

For a given set of threshold values, it is not hard to see that the GTFP (corresponding to these threshold 
values) always achieves a rate at least as good as the rate achieved by the MTFP (corresponding to the 
same threshold values) because MUs request all the beams with SINR values above their thresholds 
under the GTFP, which includes the best beam with the highest SINR. Since maximum SINR values are 
also fed back by GTFPs, they can be considered more general than MTFPs. Moreover, as shown later 
in Lemma 3, a GTFP reduces to an MTFP if threshold values of all MUs are greater than one. In this 
section, we will first prove that GTFPs form a rate-wise optimal subset of general decentralized feedback 
policies, and then obtain a similar result for MTFPs. 
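
The two threshold rules can be contrasted with a short Python sketch. It is an illustration under stated assumptions: a feedback decision is represented only by the set of requested beam indexes, the received beam powers are drawn as unit-mean exponentials (Rayleigh fading with orthonormal beams), and the helper names are hypothetical. The loop checks numerically that both rules produce the same requests once the threshold exceeds one, since by (3) at most one beam can then have an SINR above the threshold.

```python
import random
from typing import List, Sequence, Set


def gtfp_rule(gamma: Sequence[float], tau: float) -> Set[int]:
    """GTFP: request every beam whose SINR is at least tau."""
    return {k for k, g in enumerate(gamma) if g >= tau}


def mtfp_rule(gamma: Sequence[float], tau: float) -> Set[int]:
    """MTFP: request only the best beam, and only if its SINR is at least tau."""
    best = max(range(len(gamma)), key=lambda k: gamma[k])
    return {best} if gamma[best] >= tau else set()


def sinr_from_powers(powers: Sequence[float], rho: float) -> List[float]:
    """SINR on each beam from the received beam powers |h^T b_k|^2, as in (3)."""
    total = sum(powers)
    return [p / (1.0 / rho + (total - p)) for p in powers]


rng = random.Random(0)
agree = True
for _ in range(10000):
    # Assumed Rayleigh fading: beam powers are i.i.d. unit-mean exponentials.
    powers = [rng.expovariate(1.0) for _ in range(4)]
    gamma = sinr_from_powers(powers, rho=10.0)
    if gtfp_rule(gamma, 1.5) != mtfp_rule(gamma, 1.5):
        agree = False
print(agree)
```

For thresholds below one, the GTFP request set can be strictly larger than the MTFP one, which is why the GTFP rate is never smaller for the same thresholds.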

A. Optimality of General Threshold Feedback Policies 

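It is enough to focus only on the first beam since $R(\mathcal{F})$ can be written as

$$R(\mathcal{F}) = M \, \mathsf{E}_{\boldsymbol{\Gamma}}\left[\log\left(1 + \max_{i \in \mathcal{G}_1(\mathcal{F}(\boldsymbol{\Gamma}))} \gamma_{1,i}\right)\right] \qquad (6)$$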



under our assumptions in Section III. 

For our proofs, we will define various sets whose elements lie in various spaces including $\mathbb{R}_+^M$ and
$\mathbb{R}_+^{M \times n}$. Therefore, paying attention to the space in which the elements of a set lie will facilitate exposition
in the rest of the paper.
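For a given beam symmetric general decentralized feedback policy $\mathcal{F} = (\mathcal{F}_1, \ldots, \mathcal{F}_n)^\top$, we let
$FB_i = \{\boldsymbol{\gamma}_i \in \mathbb{R}_+^M : \mathcal{F}_i(\boldsymbol{\gamma}_i) \text{ selects beam } 1\}$ for all $i \in \mathcal{N}$. Given $\mathcal{F}$, we construct a GTFP $\mathcal{T}$ by choosing $\tau_i$ such that
$\Pr\{\gamma_{1,i} \ge \tau_i\} = \Pr\{\boldsymbol{\gamma}_i \in FB_i\}$ for all $i \in \mathcal{N}$. This construction is feasible since $\gamma_{1,i}$ is assumed to have a
continuous distribution function. Such a selection of $\mathcal{T}$ leads to a fair comparison between $\mathcal{F}$ and $\mathcal{T}$ since
$\Lambda(\mathcal{F}) = \Lambda(\mathcal{T})$. We divide $FB_i$ into two disjoint sets $S_i^L = \{\boldsymbol{\gamma}_i \in \mathbb{R}_+^M : \boldsymbol{\gamma}_i \in FB_i \ \&\ \gamma_{1,i} < \tau_i\}$, and
$S_i^R = \{\boldsymbol{\gamma}_i \in \mathbb{R}_+^M : \boldsymbol{\gamma}_i \in FB_i \ \&\ \gamma_{1,i} \ge \tau_i\}$. Finally, we let $\bar{S}_i^R = \{\boldsymbol{\gamma}_i \in \mathbb{R}_+^M : \boldsymbol{\gamma}_i \notin FB_i \ \&\ \gamma_{1,i} \ge \tau_i\}$.
We will use these sets to show $R(\mathcal{T}) \ge R(\mathcal{F})$.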

The proof is simple for a single user single beam communication scenario. For a particular realization
of the SINR value $\gamma_1$, the same instantaneous rate is achieved by both feedback policies if they result in
the same feedback decision. On the other hand, the achieved instantaneous rate will be different if only
one of the policies results in a positive feedback decision. This happens either when $\gamma_1 \in \mathcal{S}_1^L$, in which
case only $\mathcal{F}$ leads to a positive feedback decision, or when $\gamma_1 \in \bar{\mathcal{S}}_1^R$, in which case only $\mathcal{T}$ leads to a
positive feedback decision. The worst case SINR on the event $\gamma_1 \in \bar{\mathcal{S}}_1^R$ is greater than the threshold value
$\tau_1$, and the best case SINR achieved by the MU on the event $\gamma_1 \in \mathcal{S}_1^L$ is at most $\tau_1$. Therefore, the
rates achieved by $\mathcal{F}$ and $\mathcal{T}$ can be upper and lower bounded, respectively, to show that $R(\mathcal{T}) \geq R(\mathcal{F})$.
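
The single-user argument can be checked numerically. The sketch below is only an illustration under assumptions that are not part of the paper: the SINR is taken to be Exp(1) distributed, and the non-threshold rule (feed back only when $0.5 < \gamma_1 \leq 1.5$) is a hypothetical example. It builds the threshold rule with the same feedback probability and compares the two average rates.

```python
import math

def expected_rate(a, b, steps=20_000):
    """E[log(1 + g) * 1{a < g <= b}] for g ~ Exp(1), using the midpoint rule."""
    upper = min(b, 50.0)  # the Exp(1) tail beyond 50 is negligible
    h = (upper - a) / steps
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * h
        total += math.log1p(x) * math.exp(-x)
    return total * h

# Hypothetical non-threshold rule: feed back only when 0.5 < g <= 1.5.
a, b = 0.5, 1.5
p_feedback = math.exp(-a) - math.exp(-b)

# Threshold rule with the same feedback probability: Pr{g > tau} = p_feedback.
tau = -math.log(p_feedback)

rate_interval = expected_rate(a, b)
rate_threshold = expected_rate(tau, math.inf)
print(f"tau = {tau:.4f}, interval rule = {rate_interval:.4f}, threshold rule = {rate_threshold:.4f}")
```

Both rules feed back equally often, but the threshold rule trades the low-SINR part of the interval for the high-SINR tail, which is the comparison made in the paragraph above.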

The proof for the multiuser scenario hinges on the same principles above but it is not straightforward due
to coupling effects of individual feedback rules on the aggregate rate expression. Part of the complexity in
dealing with these effects arises from the heterogeneous nature of the feedback rules. For example, consider
a two-user single beam communication scenario. Let $\mathcal{F} = (\mathcal{F}_1, \mathcal{F}_2)$ be a general decentralized feedback
policy, and $\mathcal{T} = (\mathcal{T}_1, \mathcal{T}_2)$ be the corresponding general threshold feedback policy as constructed above.
Consider the event $A$ in which $\gamma_1 \in \bar{\mathcal{S}}_1^R$ and $\gamma_2 \in \mathcal{S}_2^L$. On this event, $\mathcal{F}$ schedules MU 2, whereas $\mathcal{T}$
schedules MU 1. If $\tau_2 > \tau_1$, we can envisage cases in which both $\gamma_2 > \gamma_1$ and $\gamma_1 > \gamma_2$ can happen
with positive probability on $A$. For example, we can represent the sets of interest defined earlier on
the real line in this case (i.e., $M = 1$), and Figs. 1(a) and 1(b) show example realizations of $\gamma_1$ and
$\gamma_2$ for which $r(\mathcal{T}, \Gamma) < r(\mathcal{F}, \Gamma)$ and $r(\mathcal{T}, \Gamma) > r(\mathcal{F}, \Gamma)$, respectively. Therefore, average sum-rates
cannot be bounded easily to determine which feedback policy achieves the higher expected rate on $A$. The
same arguments continue to hold for other events, and the problem complexity is further magnified with
increasing numbers of MUs. To overcome these issues, we will prove a more general result indicating that
the best strategy for a MU is to always use a threshold feedback policy whatever the feedback policies
of other MUs are.

[Figure 1: two panels showing, for User 1 and User 2 on the real line, the thresholds $\tau_1$, $\tau_2$ and the sets defined above. (a) $\gamma_1 \in \bar{\mathcal{S}}_1^R$, $\gamma_2 \in \mathcal{S}_2^L$ and $r(\mathcal{T}, \Gamma) < r(\mathcal{F}, \Gamma)$. (b) $\gamma_1 \in \bar{\mathcal{S}}_1^R$, $\gamma_2 \in \mathcal{S}_2^L$ and $r(\mathcal{T}, \Gamma) > r(\mathcal{F}, \Gamma)$.]

Fig. 1. A two-user example indicating problem complexity due to heterogeneity and the coupling effects between individual
feedback policies.



To this end, we let

$$\mathcal{G}_1^{-1}(\mathcal{F}(\Gamma)) = \{i \in \mathcal{N} : i \neq 1 \ \&\ i \in \mathcal{G}_1(\mathcal{F}(\Gamma))\}$$

for a given $\mathcal{F} = (\mathcal{F}_1, \ldots, \mathcal{F}_n)$. That is, $\mathcal{G}_1^{-1}(\mathcal{F}(\Gamma))$ is the random set of users containing all MUs
requesting beam 1 under $\mathcal{F}$, except for the first MU. The superscript $-1$ is used to indicate that all MUs
but MU 1 requesting beam 1 are included in $\mathcal{G}_1^{-1}(\mathcal{F}(\Gamma))$. The maximum beam 1 SINR value achieved
by a MU in this random set is denoted by $\gamma_1^\star(\mathcal{F})$, i.e., $\gamma_1^\star(\mathcal{F}) = \max_{i \in \mathcal{G}_1^{-1}(\mathcal{F}(\Gamma))} \gamma_{1,i}$.

Consider now the decentralized feedback policy $\mathcal{F}^1 = (\mathcal{T}_1, \mathcal{F}_2, \ldots, \mathcal{F}_n)$. That is, we only allow MU
1 to switch to the threshold feedback rule $\mathcal{T}_1$ with the threshold value $\tau_1$ determined as above. Then, for
almost all realizations of $\Gamma$, we have $\gamma_1^\star(\mathcal{F}) = \gamma_1^\star(\mathcal{F}^1) = \gamma_1^\star$. Therefore, the difference between $R(\mathcal{F})$
and $R(\mathcal{F}^1)$ depends only on the rate achieved by MU 1 under these two feedback policies.

We are interested in proving $R(\mathcal{F}) \leq R(\mathcal{T})$. A brief sketch of the proof is as follows. We first prove
that $R(\mathcal{F}) \leq R(\mathcal{F}^1)$. To this end, we let $\Gamma_{-1}$ be the SINR matrix containing SINR values of all MUs
except those of the first MU. We also let $R(\mathcal{F} | \Gamma_{-1}) = \mathbb{E}_{\Gamma}[r(\mathcal{F}, \Gamma) | \Gamma_{-1}]$ be the conditional average
sum-rate achieved by $\mathcal{F}$ for a given $\Gamma_{-1}$. Then, it is enough to show that $R(\mathcal{F}^1 | \Gamma_{-1}) \geq R(\mathcal{F} | \Gamma_{-1})$
for almost all $\Gamma_{-1}$. This result implies that the sum-rate increases if MU 1 switches to a threshold
feedback rule regardless of feedback rules of other MUs. Repeating the same steps for other MUs
$i \in \{2, 3, \ldots, n\}$ one-by-one, we end up with the threshold feedback policy $\mathcal{T}$ after $n$ steps, and
conclude that $R(\mathcal{T}) \geq R(\mathcal{F})$.

Before giving the details of the proof sketched above, we will first perform a preliminary analysis.
For the rest of this part of the paper, $\mathcal{F}^1$ will represent the decentralized feedback policy derived from a
given decentralized feedback policy $\mathcal{F}$ as above. When we switch from $\mathcal{F}$ to $\mathcal{F}^1$, we can identify three
main types of events: neutral, loss and gain events. On the neutral event, we will continue to achieve the
same downlink throughput under both feedback policies. On the loss event, we will lose some data rate
upon switching to $\mathcal{F}^1$ from $\mathcal{F}$. Finally, on the gain event, we will gain some data rate upon switching
to $\mathcal{F}^1$ from $\mathcal{F}$. The difference $R(\mathcal{F}^1) - R(\mathcal{F})$ depends on the average sum-rates lost and gained on
the loss and gain events. To show that $R(\mathcal{F}^1) - R(\mathcal{F}) \geq 0$, we need to characterize these loss and gain
events precisely. We first formally define these events, and then provide their further characterizations
suitable for our analysis in Lemmas 1 and 2.

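Definition 5: The loss, gain and neutral events upon switching to $\mathcal{F}^1$ from $\mathcal{F}$ on beam 1 are defined
as

$$A_L = \{\Gamma \in \mathbb{R}_+^{M \times n} : r^1(\mathcal{F}^1, \Gamma) < r^1(\mathcal{F}, \Gamma)\}, \qquad (7)$$

$$A_G = \{\Gamma \in \mathbb{R}_+^{M \times n} : r^1(\mathcal{F}^1, \Gamma) > r^1(\mathcal{F}, \Gamma)\} \qquad (8)$$

and

$$A_N = \{\Gamma \in \mathbb{R}_+^{M \times n} : r^1(\mathcal{F}^1, \Gamma) = r^1(\mathcal{F}, \Gamma)\}, \qquad (9)$$

respectively.

The neutral event is not of much interest since both policies achieve the same rate on this
event. However, loss and gain events require further evaluation, and the next two lemmas provide other
characterizations for these events. These characterizations will be important when we compare $R(\mathcal{F}^1)$
against $R(\mathcal{F})$.

Lemma 1: $A_L$ is equal to

$$A_L = \{\Gamma \in \mathbb{R}_+^{M \times n} : \gamma_1 \in \mathcal{S}_1^L \ \&\ \gamma_1^\star < \gamma_{1,1}\}.$$

Proof: See Appendix A. ■

A similar characterization for the gain event on beam 1 is given in the next lemma.

Lemma 2: $A_G$ is equal to

$$A_G = \{\Gamma \in \mathbb{R}_+^{M \times n} : \gamma_1 \in \bar{\mathcal{S}}_1^R \ \&\ \gamma_1^\star < \gamma_{1,1}\}.$$

Proof: See Appendix A. ■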

These auxiliary results will help to prove the sum-rate optimality of $\mathcal{F}^1$ over $\mathcal{F}$ in Theorem 1. Before
providing the details of the proof of this theorem, we will again give a sketch of the proof. $A_L$, $A_G$ and
$A_N$ are three disjoint events with total probability mass of one. Therefore, for a feedback policy $\mathcal{F}$, we
can write $R^1(\mathcal{F} | \Gamma_{-1}) = R^1(\mathcal{F}, A_L | \Gamma_{-1}) + R^1(\mathcal{F}, A_G | \Gamma_{-1}) + R^1(\mathcal{F}, A_N | \Gamma_{-1})$.

We can write a similar expression for $R^1(\mathcal{F}^1 | \Gamma_{-1})$. Comparison of these two expressions term-by-term
reveals that $R^1(\mathcal{F}^1 | \Gamma_{-1}) \geq R^1(\mathcal{F} | \Gamma_{-1})$. Since this inequality holds for almost all $\Gamma_{-1}$, we also have
$R^1(\mathcal{F}^1) \geq R^1(\mathcal{F})$. Since beams are statistically identical, the total rate is $M$ times the rate achieved
on beam 1. Therefore, we finally have $R(\mathcal{F}^1) \geq R(\mathcal{F})$. We make this idea formal in the proof of the
next theorem.

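Theorem 1: Let $\mathcal{F} = (\mathcal{F}_1, \ldots, \mathcal{F}_n)$ and $\mathcal{F}^1 = (\mathcal{T}_1, \mathcal{F}_2, \ldots, \mathcal{F}_n)$ be defined as above. Then, $\Lambda(\mathcal{F}) =
\Lambda(\mathcal{F}^1)$, and $R(\mathcal{F}^1) \geq R(\mathcal{F})$ for any $M \geq 1$.

Proof: It is enough to prove $R^1(\mathcal{F}^1 | \Gamma_{-1}) \geq R^1(\mathcal{F} | \Gamma_{-1})$ for almost all $\Gamma_{-1}$. By definition, we have
$R^1(\mathcal{F}, A_N | \Gamma_{-1}) = R^1(\mathcal{F}^1, A_N | \Gamma_{-1})$, and therefore we are only interested in the average sum-rates on
loss and gain events.

The following identity follows from the definition of conditional expectation.

$$R^1(\mathcal{F}, A_L | \Gamma_{-1}) = \Pr(A_L | \Gamma_{-1})\, \mathbb{E}_{\Gamma}\left[r^1(\mathcal{F}, \Gamma) \mid A_L, \Gamma_{-1}\right].$$

Lemma 1 implies that whenever $A_L$ occurs, MU 1 requests beam 1, and achieves the best SINR on
beam 1 among all the MUs requesting beam 1. Since $\gamma_1 \in \mathcal{S}_1^L$ on $A_L$, $\gamma_{1,1}$ is at most $\tau_1$. Therefore,

$$R^1(\mathcal{F}, A_L | \Gamma_{-1}) \leq \Pr(A_L | \Gamma_{-1}) \log(1 + \tau_1). \qquad (10)$$

Similarly, we can write

$$R^1(\mathcal{F}, A_G | \Gamma_{-1}) = \Pr(A_G | \Gamma_{-1})\, \mathbb{E}_{\Gamma}\left[r^1(\mathcal{F}, \Gamma) \mid A_G, \Gamma_{-1}\right].$$

Lemma 2 implies that MU 1 achieves the best SINR on beam 1 among all the MUs requesting beam
1 but $\gamma_1 \in \bar{\mathcal{S}}_1^R$ on $A_G$. Therefore, $\gamma_1 \notin FB_1$, and MU 1 will not request beam 1 under $\mathcal{F}$. Hence, $\mathcal{F}$
schedules beam 1 to the MU with SINR value $\gamma_1^\star$, which leads to^3

$$R^1(\mathcal{F}, A_G | \Gamma_{-1}) = \Pr(A_G | \Gamma_{-1}) \log(1 + \gamma_1^\star). \qquad (11)$$

Similar to the above arguments, MU 1 will not request beam 1 under $\mathcal{F}^1$ on the event $A_L$ since
$\gamma_1 \in \mathcal{S}_1^L$. This means

$$R^1(\mathcal{F}^1, A_L | \Gamma_{-1}) = \Pr(A_L | \Gamma_{-1}) \log(1 + \gamma_1^\star). \qquad (12)$$

^3 Note that $\gamma_1^\star$ is a (measurable) function of $\Gamma_{-1}$, and therefore (11) conforms with the measure theoretic definition of the
conditional expectation.

Finally, MU 1 requests beam 1 under $\mathcal{F}^1$ on $A_G$, leading to

$$R^1(\mathcal{F}^1, A_G | \Gamma_{-1}) \geq \Pr(A_G | \Gamma_{-1}) \log(1 + \max(\tau_1, \gamma_1^\star)). \qquad (13)$$

By using (10), (11), (12) and (13), we have

$$R^1(\mathcal{F}^1 | \Gamma_{-1}) - R^1(\mathcal{F} | \Gamma_{-1}) \geq \Pr(A_G | \Gamma_{-1}) \left(\log(1 + \max(\tau_1, \gamma_1^\star)) - \log(1 + \gamma_1^\star)\right)
+ \Pr(A_L | \Gamma_{-1}) \left(\log(1 + \gamma_1^\star) - \log(1 + \tau_1)\right).$$

To conclude the proof, we need to analyze two different cases separately. If $\gamma_1^\star \geq \tau_1$, then it directly
follows that $R^1(\mathcal{F}^1 | \Gamma_{-1}) - R^1(\mathcal{F} | \Gamma_{-1}) \geq 0$. If $\gamma_1^\star < \tau_1$, then we have

$$R^1(\mathcal{F}^1 | \Gamma_{-1}, \gamma_1^\star < \tau_1) - R^1(\mathcal{F} | \Gamma_{-1}, \gamma_1^\star < \tau_1)
\geq \left(\Pr(A_G | \Gamma_{-1}, \gamma_1^\star < \tau_1) - \Pr(A_L | \Gamma_{-1}, \gamma_1^\star < \tau_1)\right) \left(\log(1 + \tau_1) - \log(1 + \gamma_1^\star)\right).$$

Observe that $\Pr(A_G | \Gamma_{-1}, \gamma_1^\star < \tau_1) = \Pr\{\gamma_1 \in \bar{\mathcal{S}}_1^R\}$ and $\Pr(A_L | \Gamma_{-1}, \gamma_1^\star < \tau_1) \leq \Pr\{\gamma_1 \in \mathcal{S}_1^L\}$.
Since $\Pr\{\gamma_1 \in \mathcal{S}_1^L\} = \Pr\{\gamma_1 \in \bar{\mathcal{S}}_1^R\}$, we have $R^1(\mathcal{F}^1 | \Gamma_{-1}, \gamma_1^\star < \tau_1) - R^1(\mathcal{F} | \Gamma_{-1}, \gamma_1^\star < \tau_1) \geq 0$.
After removing conditioning, this proves that $R^1(\mathcal{F}^1 | \Gamma_{-1}) \geq R^1(\mathcal{F} | \Gamma_{-1})$ for almost all $\Gamma_{-1}$, and
therefore $R^1(\mathcal{F}^1) \geq R^1(\mathcal{F})$. ■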

This theorem shows that if a MU starts using a threshold feedback rule, the sum-rate improves regardless 
of the feedback rules of all other users. This leads to the following key finding. 

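Theorem 2: For any beam symmetric general decentralized feedback policy $\mathcal{F}$, there exists a GTFP
$\mathcal{T}$ such that $\Lambda(\mathcal{F}) = \Lambda(\mathcal{T})$ and $R(\mathcal{T}) \geq R(\mathcal{F})$.

Proof: For a given $\mathcal{F} = (\mathcal{F}_1, \ldots, \mathcal{F}_n)$, let $\mathcal{T} = (\mathcal{T}_1, \ldots, \mathcal{T}_n)$ be the GTFP constructed as above.
Let $\mathcal{F}^k = (\mathcal{T}_1, \ldots, \mathcal{T}_k, \mathcal{F}_{k+1}, \ldots, \mathcal{F}_n)$ for $1 \leq k \leq n-1$. When $k = n$, we have $\mathcal{F}^n = \mathcal{T}$. By Theorem
1, we have $R(\mathcal{F}) \leq R(\mathcal{F}^1) \leq \cdots \leq R(\mathcal{F}^n) = R(\mathcal{T})$. Since $\Lambda(\mathcal{F}) = \Lambda(\mathcal{F}^1) = \cdots = \Lambda(\mathcal{F}^n) =
\Lambda(\mathcal{T})$, the proof is complete. ■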
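
The one-user-at-a-time replacement used in this proof can be illustrated numerically. The following sketch makes assumptions that are not in the paper: a single beam, two users with i.i.d. Exp(1) SINRs, and hypothetical interval-type feedback rules. It uses the identity $\mathbb{E}[\log(1+X)] = \int_0^\infty \Pr\{X > x\}/(1+x)\,dx$ for a nonnegative random variable $X$.

```python
import math

def cdf(x):
    """Exp(1) CDF (an illustrative SINR model, not the paper's)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def truncated_cdf(x, rule):
    """Pr{g * 1{a < g <= b} <= x} for g ~ Exp(1) and a feedback rule (a, b)."""
    a, b = rule
    p_silent = 1.0 - (cdf(b) - cdf(a))
    p_feedback_below_x = max(0.0, cdf(min(x, b)) - cdf(a))
    return p_silent + p_feedback_below_x

def sum_rate(rules, upper=50.0, steps=20_000):
    """E[log(1 + max_i truncated SINR_i)] via the tail-integral identity."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        prod = 1.0
        for rule in rules:
            prod *= truncated_cdf(x, rule)
        total += (1.0 - prod) / (1.0 + x)
    return total * h

def matching_threshold(rule):
    """Threshold rule (tau, inf) with the same feedback probability as `rule`."""
    a, b = rule
    return (-math.log(cdf(b) - cdf(a)), math.inf)

policy_f = [(0.5, 1.5), (0.2, 0.9)]                 # hypothetical non-threshold rules
policy_f1 = [matching_threshold(policy_f[0]), policy_f[1]]  # only MU 1 switches
policy_t = [matching_threshold(r) for r in policy_f]        # both MUs switch

rates = [sum_rate(policy_f), sum_rate(policy_f1), sum_rate(policy_t)]
print(rates)
```

Under these assumptions the three rates form a non-decreasing chain, in line with $R(\mathcal{F}) \leq R(\mathcal{F}^1) \leq R(\mathcal{T})$.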

B. Optimality of Maximum SINR Threshold Feedback Policies 

In this part, we briefly explain why similar results also hold for MTFPs. The proof techniques are the
same except for some subtle differences. To start with, under a maximum SINR decentralized feedback
policy, each MU requests only the beam achieving the maximum SINR if the feedback conditions are
met, i.e., see Definitions 1 and 4. Hence, the thresholds are set such that $\Pr\{b_i^\star = 1 \text{ and } \gamma_i^\star > \tau_i\} =
\Pr\{\gamma_i \in FB_i\}$. The definition of $FB_i$ is refined so that MU $i$ requests beam 1 if and only if $b_i^\star = 1$
and $\gamma_i^\star$ satisfies the feedback conditions. The definitions of other sets and events of interest require only
some subtle modifications, too. For example, $A_L$ can now be defined as

$$A_L = \{\Gamma \in \mathbb{R}_+^{M \times n} : \gamma_1 \in \mathcal{S}_1^L \ \&\ \gamma_1^\star(\mathcal{F}) < \gamma_1^\star\},$$

where $\mathcal{S}_1^L = \{\gamma_1 \in \mathbb{R}_+^M : \gamma_1 \in FB_1 \ \&\ \gamma_1^\star \leq \tau_1\}$, and the competing maximum $\gamma_1^\star(\mathcal{F})$ is written with its
argument to distinguish it from the maximum SINR $\gamma_1^\star$ of MU 1. The next two theorems provide results analogous to
the ones stated in Theorems 1 and 2.

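Theorem 3: For a given beam symmetric decentralized maximum SINR policy $\mathcal{F} = (\mathcal{F}_1, \ldots, \mathcal{F}_n)$,
let $\mathcal{F}^1 = (\mathcal{T}_1, \mathcal{F}_2, \ldots, \mathcal{F}_n)$ be the maximum SINR feedback policy derived from $\mathcal{F}$ by
allowing MU 1 to switch from $\mathcal{F}_1$ to $\mathcal{T}_1$, where $\mathcal{T}_1$ is a beam symmetric maximum SINR threshold rule
whose threshold is set as above. Then, $\Lambda(\mathcal{F}) = \Lambda(\mathcal{F}^1)$, and $R(\mathcal{F}^1) \geq R(\mathcal{F})$ for any $M \geq 1$.

Theorem 4: For any beam symmetric decentralized maximum SINR feedback policy $\mathcal{F}$, there exists
an MTFP $\mathcal{T}$ such that $\Lambda(\mathcal{F}) = \Lambda(\mathcal{T})$ and $R(\mathcal{T}) \geq R(\mathcal{F})$.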

Since the proofs of these theorems are similar to the proofs above, we skip them to avoid repetition. 
It is important to note that Theorems 2 and 4 hold for any continuous SINR distribution. 

C. Discussion of Results 

In this part, we briefly discuss the results presented above. We start with a comparison between GTFPs
and MTFPs. The main advantage of GTFPs over MTFPs is the ability of the BS to allocate multiple
beams to a MU. Therefore, a GTFP achieves higher data rates when compared to an MTFP
with the same threshold levels. From a practical point of view, such gains in data rates are expected to be
minor due to dependencies among beams at a MU, i.e., a high $\gamma_{k,i}$ implies a low $\gamma_{m,i}$ for all $m \neq k$. Moreover,
both types of policies achieve the same performance if all threshold values are greater than 1, which is
formally proved in the next lemma.

Lemma 3: Let $\mathcal{T}$ be an MTFP with thresholds $\{\tau_i\}_{i=1}^n$, and $\mathcal{T}'$ be the corresponding GTFP with the
same threshold levels. Let $\mathcal{N}_m$ and $\mathcal{N}'_m$ be the sets of MUs requesting beam $m \in \mathcal{M}$ according to $\mathcal{T}$
and $\mathcal{T}'$, respectively. If $\tau_i > 1$ for all $i \in \mathcal{N}$, then $\mathcal{N}_m = \mathcal{N}'_m$.

Proof: See Appendix B. ■ 

Note that the requirement on threshold values for the equality of MTFPs and GTFPs in Lemma 3 is
only a 0 [dB] requirement, which is practically quite a low SINR value. This implies that both feedback
policies will actually achieve the same sum-rate in almost all practical communication scenarios.
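
The intuition behind the 0 [dB] requirement can be illustrated with a short simulation. The sketch below assumes, since the SINR model is not restated in this section, that the SINR on beam $m$ has the usual form of the beam's received power divided by the noise term $1/\rho$ plus the power received on all other beams. The function names, the exponential power samples and the value of `rho` are hypothetical choices for illustration only.

```python
import math
import random

def beam_sinrs(powers, rho):
    """SINR per beam under the assumed model: p_m / (1/rho + sum of other powers)."""
    total = sum(powers)
    return [p / (1.0 / rho + (total - p)) for p in powers]

def to_db(x):
    """Convert a linear ratio to decibels."""
    return 10.0 * math.log10(x)

random.seed(0)
max_beams_above_0db = 0
for _ in range(10_000):
    # Hypothetical received powers on M = 4 beams at one MU.
    powers = [random.expovariate(1.0) for _ in range(4)]
    sinrs = beam_sinrs(powers, rho=10.0)
    count = sum(1 for s in sinrs if s > 1.0)
    max_beams_above_0db = max(max_beams_above_0db, count)

print(max_beams_above_0db, to_db(1.0))
```

Under this model an SINR above 1 means that one beam carries more power than all other beams combined, so at most one beam per MU can exceed the 0 [dB] level. With thresholds above 1, a general threshold rule therefore never requests more than the maximum SINR beam.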

On the other hand, from a theoretical point of view, the resulting optimization problem over $\mathbb{R}_+^n$ is
more amenable to further analysis if we only focus on GTFPs. More specifically, we can search
for the optimal beam symmetric feedback policies within the class of GTFPs without sacrificing
optimality thanks to Theorem 2, and with a slight abuse of notation, we can equivalently write (5) as

$$\begin{array}{ll} \underset{\tau \in \mathbb{R}_+^n}{\text{maximize}} & R(\tau) \\ \text{subject to} & \sum_{i=1}^{n} \Pr\{\gamma_{1,i} > \tau_i\} \leq \lambda. \end{array} \qquad (14)$$

Some further game theoretic insights are as follows. We will only focus on GTFPs but similar
explanations also hold for MTFPs. Given the same utility function $R(\mathcal{F}_1, \ldots, \mathcal{F}_n)$ for all MUs, the
selfish optimization problem faced by MU $i$ is to choose a beam symmetric decentralized feedback rule
maximizing its utility given other MUs' feedback rules without increasing the feedback level. Theorem
1 shows that the dominant strategy is to switch from $\mathcal{F}_i$ to the corresponding threshold rule $\mathcal{T}_i$. As a
result, the set of GTFPs constitutes the set of Nash equilibria for this feedback rule selection game, and
therefore GTFPs are also stable operating points from a game theoretic point of view.

In the rest of the paper, we will analyze the finite dimensional optimization problem in (14). We will
show that the sum-rate becomes a Schur-concave function of feedback probabilities $p_i = \Pr\{\gamma_{1,i} > \tau_i\}$ if
the SINR distribution satisfies some mild conditions. This result establishes the optimality of homogenous
general threshold feedback policies among the class of beam symmetric general decentralized feedback
policies.

V. Optimal Threshold Selection Problem 

The optimization problem in (14), which we call optimal threshold selection problem, is still not easy 
to solve, even for a simple two-user system, due to the non-convex objective function and the non- 
convex constraint set depending on the distribution of SINR values. The complexity of the problem 
further increases with increasing numbers of users due to the dimensionality growth. Therefore, it is 
not possible to solve the optimal threshold selection problem in its full generality for a general n-user 
system. However, we can still search for a structure in the sum-rate function to solve the optimal threshold 
selection problem, which is what we will do in the remainder of this section. 

More specifically, we will search for sufficient conditions to be satisfied by SINR distributions so that
the sum-rate becomes a Schur-concave function of feedback probabilities. Roughly speaking, a Schur-concave
function increases when the dispersion among the components of its argument decreases, which
implies a solution for the optimization problem in (14) is a homogenous threshold feedback policy in
which thresholds are set according to

$$\tau_i = F^{-1}\left(1 - \frac{\lambda}{n}\right) \quad \text{for all } i \in \mathcal{N} \qquad (15)$$

if the sum-rate is a Schur-concave function. We make this intuitive idea rigorous below.

The rest of this section is organized as follows. We first provide an overview of our main results in 
the next subsection without any formal proofs. We then introduce some key concepts from the theory 
of majorization in Subsection V-B. Finally, formal proofs are supplied in Subsection V-C and in related 
appendices. 

A. Main Results 

The main results of this section are stated in Theorems 5 and 6. In these theorems, we view the sum-rate
as a function of feedback probabilities. This approach does not limit the generality of our results
since the SINR probability density function is already assumed to have $\mathbb{R}_+$ as its support, and therefore there
is a one-to-one correspondence between feedback threshold values $\tau_i$ and the feedback probabilities $p_i$,
i.e., $\tau_i = F^{-1}(1 - p_i)$ for all $i \in \mathcal{N}$. As already noted in Section III, this assumption is satisfied for
many commonly used practical fading models such as Rayleigh, Rician and Nakagami fading. It may
still be possible to extend similar proof techniques to more general fading distributions; this is a future research
direction of interest which we do not pursue in this paper since the analysis is already complicated even
with this simplifying assumption. Our theorems are as follows.
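
The correspondence $\tau_i = F^{-1}(1 - p_i)$ can be evaluated numerically for any continuous, strictly increasing CDF. The sketch below uses bisection. As an example distribution it assumes the well known beam SINR CDF for Rayleigh fading with $M$ orthonormal beams, $F(x) = 1 - e^{-x/\rho}/(1+x)^{M-1}$, which is not stated in this section and is used here only for illustration. The function names and parameter values are hypothetical.

```python
import math

def rayleigh_sinr_cdf(x, rho=1.0, num_beams=2):
    """Assumed beam SINR CDF: F(x) = 1 - exp(-x / rho) / (1 + x)^(M - 1)."""
    return 1.0 - math.exp(-x / rho) / (1.0 + x) ** (num_beams - 1)

def threshold_for(p, cdf, hi=1e6, iters=200):
    """Solve F(tau) = 1 - p by bisection; F must be continuous and increasing."""
    lo = 0.0
    target = 1.0 - p
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Homogeneous feedback probabilities p_i = lam / n for a feedback budget lam.
n_users, lam = 4, 1.0
tau = threshold_for(lam / n_users, rayleigh_sinr_cdf)
print(f"common threshold tau = {tau:.4f}")
```

A smaller feedback probability maps to a larger threshold, so the mapping between $\tau_i$ and $p_i$ is monotone decreasing as well as one-to-one.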

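Theorem 5: The sum-rate $R(p)$ is a Schur-concave function if

$$\log(1 + \gamma)(\lambda - 2q) + \int_{F^{-1}(1+q-\lambda)}^{F^{-1}(1-q)} \frac{F(x)}{1+x}\,dx - (\lambda - 2q)\log\left(1 + F^{-1}(1 + q - \lambda)\right) \geq 0 \qquad (16)$$

for all $\gamma \geq 0$, $\lambda \in [0, 2]$ and $\max\{0, \lambda - 1\} \leq q \leq \frac{\lambda}{2}$.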

Theorem 6: The sum-rate $R(p)$ is Schur-concave if $f$ is bounded at zero, and has the derivative $f'$
satisfying

$$f'\left(F^{-1}(x)\right) \leq -\frac{f\left(F^{-1}(x)\right)}{1 + F^{-1}(x)} \qquad (17)$$

for all $x \in [0, 1]$.

The proofs of Theorems 5 and 6 require introduction of new notation, and involve several cases to 
analyze separately. We also need some key results from the theory of majorization [19] to prove these 
results. Therefore, we have relegated their proofs to the following subsections and appendixes. We now 
briefly discuss their implications. 

We first note that the sufficient condition for the Schur-concavity of the sum-rate given in (16) is
more general than the one given in (17) in the sense that (16) always holds whenever (17) holds, but not vice
versa. This is formally established in Subsection V-C. Furthermore, since the first term in (16) is always
nonnegative, an easier condition to check for the Schur-concavity of the sum-rate function is

$$\int_{F^{-1}(1+q-\lambda)}^{F^{-1}(1-q)} \frac{F(x)}{1+x}\,dx - (\lambda - 2q)\log\left(1 + F^{-1}(1 + q - \lambda)\right) \geq 0 \qquad (18)$$

for all $\lambda \in [0, 2]$ and $\max\{0, \lambda - 1\} \leq q \leq \frac{\lambda}{2}$. Further, we can bound (18) from below to obtain another
sufficient condition as

$$(1 + q - \lambda)\log\left(1 + F^{-1}(1 - q)\right) - (1 - q)\log\left(1 + F^{-1}(1 + q - \lambda)\right) \geq 0, \qquad (19)$$

for all $\lambda \in [0, 2]$ and $\max\{0, \lambda - 1\} \leq q \leq \frac{\lambda}{2}$. For a two-user system, (18) is also necessary, i.e., see
Lemma 10 and discussions therein.

Although the conditions (18) and (19) are easy to verify numerically, they may not be tractable
analytically. The integral expression in (18) is hard to evaluate in closed form. Analytical verification
of (19) is also difficult due to the presence of conflicting forces working in opposite directions to
increase/decrease the value of the bound. For example, the pre-log factor of the first term in (19), which is
$1 + q - \lambda$, is smaller than the pre-log factor of the second term, which is $1 - q$, for $\max\{0, \lambda - 1\} \leq q < \frac{\lambda}{2}$.
Conversely, for $\max\{0, \lambda - 1\} \leq q < \frac{\lambda}{2}$, $F^{-1}(1 - q)$ appearing inside the logarithm in the first term is
greater than $F^{-1}(1 + q - \lambda)$ appearing inside the logarithm in the second term.
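
A numerical check of (18) and (19) over a grid of $(\lambda, q)$ values might look as follows. This is a minimal sketch assuming an Exp(1) SINR distribution, so that $F(x) = 1 - e^{-x}$ and $F^{-1}(u) = -\log(1-u)$. The grid, the midpoint rule and the function names are illustrative choices, and the grid avoids the end points where $F^{-1}$ is zero or infinite.

```python
import math

def F(x):
    return 1.0 - math.exp(-x)

def F_inv(u):
    return -math.log1p(-u)

def integral_term(lo, hi, steps=2_000):
    """Midpoint rule for the integral of F(x) / (1 + x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += F(x) / (1.0 + x)
    return total * h

def lhs_18(lam, q):
    lo, hi = F_inv(1.0 + q - lam), F_inv(1.0 - q)
    return integral_term(lo, hi) - (lam - 2.0 * q) * math.log1p(lo)

def lhs_19(lam, q):
    lo, hi = F_inv(1.0 + q - lam), F_inv(1.0 - q)
    return (1.0 + q - lam) * math.log1p(hi) - (1.0 - q) * math.log1p(lo)

grid = []
for i in range(1, 20):
    lam = 0.1 * i
    q_min = max(0.0, lam - 1.0)
    for j in range(1, 11):
        grid.append((lam, q_min + (lam / 2.0 - q_min) * j / 10.0))

min_18 = min(lhs_18(lam, q) for lam, q in grid)
min_gap = min(lhs_18(lam, q) - lhs_19(lam, q) for lam, q in grid)
min_19 = min(lhs_19(lam, q) for lam, q in grid)
print(f"min of (18): {min_18:.3e}, min of (19): {min_19:.3e}")
```

Because (19) is a lower bound on (18), the gap `min_gap` is nonnegative up to numerical error for any distribution. Whether (19) itself is nonnegative depends on the distribution, which is why both quantities are reported separately.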

On the other hand, the sufficient condition for the Schur-concavity of the sum-rate function given 
in Theorem 6 turns out to be much easier to deal with analytically although it looks more complex 
than (18) and (19). In particular, it provides an almost complete characterization for the solution of 
optimal threshold selection problem for richly scattered Rayleigh fading environments. More precisely, 
(17) is always satisfied for all values of p for Rayleigh fading channels whenever $M \geq 2$. Hence, the
sum-rate is always a Schur-concave function of feedback probabilities in this case, and is maximized if 
thresholds are chosen according to (15). In Section VI, we provide a detailed discussion for the optimality 
and sub-optimality of homogenous threshold feedback policies for Rayleigh fading channels as well as 
other wireless channel models. Next, we will briefly introduce some key concepts from the theory of 
majorization to be used later in our analysis. 

B. Majorization 



For a vector $x$ in $\mathbb{R}^n$, we denote its ordered coordinates by $x_{(1)} \geq \cdots \geq x_{(n)}$. For $x$ and $y$ in $\mathbb{R}^n$, we
say $x$ majorizes $y$ and write it as $x \succeq_M y$ if we have $\sum_{i=1}^{k} x_{(i)} \geq \sum_{i=1}^{k} y_{(i)}$ when $k = 1, \ldots, n-1$,
and $\sum_{i=1}^{n} x_{(i)} = \sum_{i=1}^{n} y_{(i)}$. A function $\varphi : \mathbb{R}^n \mapsto \mathbb{R}$ is said to be Schur-convex if $x \succeq_M y$ implies
$\varphi(x) \geq \varphi(y)$, and $\varphi$ is Schur-concave if $-\varphi$ is Schur-convex. Schur-convex/concave functions often
arise in mathematical analysis and engineering applications [30], [31]. For example, every function that
is concave (convex) and symmetric is also a Schur-concave (Schur-convex) function.
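
The definition of majorization translates directly into code. The sketch below is illustrative: the function names and the example vectors are not from the paper, and the symmetric concave function $\sum_i \sqrt{p_i}$ is used only as a simple instance of a Schur-concave function.

```python
import math

def majorizes(x, y, tol=1e-12):
    """Return True if x majorizes y: equal totals and dominating partial sums
    of the coordinates sorted in decreasing order."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    if abs(sum(xs) - sum(ys)) > tol:
        return False
    partial_x = partial_y = 0.0
    for a, b in zip(xs[:-1], ys[:-1]):
        partial_x += a
        partial_y += b
        if partial_x < partial_y - tol:
            return False
    return True

def sum_of_roots(p):
    """A symmetric concave function, hence Schur-concave."""
    return sum(math.sqrt(v) for v in p)

spread = [0.6, 0.3, 0.1]
even = [1.0 / 3.0] * 3
print(majorizes(spread, even), sum_of_roots(spread), sum_of_roots(even))
```

Here the more dispersed vector majorizes the uniform one, and the Schur-concave function takes its larger value at the uniform vector.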

A Schur-concave function tends to increase when the components of its argument become more similar. 
We will establish conditions under which the sum-rate becomes a Schur-concave function, which will, in 
turn, imply the optimality of homogenous threshold feedback policies. The following lemma is helpful 
in establishing these conditions. 

Lemma 4: Let $\varphi$ be a real-valued function defined on $\mathbb{R}^n$, and $\mathcal{D} = \{z \in \mathbb{R}^n : z_1 \geq \cdots \geq z_n\}$. Then,
$\varphi$ is a Schur-convex function if and only if, for all $z \in \mathcal{D}$ and $i = 1, \ldots, n-1$,

$$\varphi(z_1, \ldots, z_{i-1}, z_i + \epsilon, z_{i+1} - \epsilon, z_{i+2}, \ldots, z_n)$$

is increasing in $\epsilon$ over the region $0 \leq \epsilon \leq \min\{z_{i-1} - z_i, z_{i+1} - z_{i+2}\}$.^4

It can be seen that the coordinates $z_i$ and $z_{i+1}$ are systematically altered by using the parameter $\epsilon$,
and the constraints on $\epsilon$ eliminate any violation in the order. Interested readers are referred to [19] for
more insights on the theory of majorization. Now, we will see how we can use this theory to identify
the Schur-concave structure in the objective rate function.
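
The pairwise transfer in Lemma 4 can be tried out on a simple Schur-convex function. The sketch below uses the sum of squares, which is symmetric and convex. The vector `z`, the index `i` (0-based in the code) and the helper names are illustrative assumptions.

```python
def transfer(z, i, eps):
    """Move eps from coordinate i+1 to coordinate i (0-based indices)."""
    out = list(z)
    out[i] += eps
    out[i + 1] -= eps
    return out

def sum_of_squares(z):
    """A symmetric convex function, hence Schur-convex."""
    return sum(v * v for v in z)

z = [0.5, 0.3, 0.2, 0.1]  # coordinates in decreasing order
i = 1                     # alter the second and third coordinates
# Largest eps that keeps the order intact, as in Lemma 4.
eps_max = min(z[i - 1] - z[i], z[i + 1] - z[i + 2])

values = [sum_of_squares(transfer(z, i, eps_max * k / 10.0)) for k in range(11)]
print(values)
```

The values increase with the transfer size, as Lemma 4 requires for a Schur-convex function. For a Schur-concave function such as the sum-rate studied here, the same transfer must not increase the function.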

C. Schur-concavity Analysis for the Sum-rate

The main objective is to establish sufficient conditions on the SINR distributions for the Schur- 
concavity of the sum-rate function. Again, we focus on the first beam to explain our proof ideas without 
any loss of generality since all beams are statistically identical. We start by analyzing the sum-rate as a 
function of thresholds as given in (14) to establish three important lemmas. Next, we will incorporate the 
feedback constraint into our optimization problem by interpreting the sum-rate as a function of feedback 
probabilities. Using these results, we will finally establish the underlying Schur-concave structure in the 
sum-rate function through the theory of majorization. 

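1) Rate as a Function of Thresholds: Consider thresholds in increasing order, i.e., $\tau_{\pi(1)} \leq \cdots \leq
\tau_{\pi(i)} \leq \tau_{\pi(i+1)} \leq \cdots \leq \tau_{\pi(n)}$. Based on Lemma 4, it is enough to consider

$$R(\tau_{\pi(i+1)} + \epsilon, \tau_{\pi(i)} - \epsilon) = R(\tau_{\pi(n)}, \ldots, \tau_{\pi(i+1)} + \epsilon, \tau_{\pi(i)} - \epsilon, \ldots, \tau_{\pi(1)})$$

to identify the underlying Schur-concave structure in the sum-rate function.^5 However, analysis of this
function is still complex. Therefore, we resort to the following divide-and-conquer approach.

^4 At the end points $i = 1$ and $i = n - 1$, the condition is modified accordingly.

Let $\mathcal{N}' = \{k \in \mathcal{N} : k \neq \pi(i) \ \&\ \pi(i+1)\}$. We fix the thresholds and the SINR values of all MUs in
$\mathcal{N}'$.^6 Randomness is now associated only with MUs $\pi(i)$ and $\pi(i+1)$. With a slight abuse of notation,
we define the truncated SINR on beam $m$ at MU $i$ as $\bar{\gamma}_{m,i} = \gamma_{m,i} 1_{\{\gamma_{m,i} > \tau_i\}}$. Let $\gamma^\star_{\mathcal{N}'} = \max_{k \in \mathcal{N}'} \bar{\gamma}_{1,k}$, which
is the maximum truncated SINR on beam 1 among the MUs in $\mathcal{N}'$. The instantaneous rate on beam 1
as a function of $\bar{\gamma}_{1,\pi(i)}$, $\bar{\gamma}_{1,\pi(i+1)}$ and $\gamma^\star_{\mathcal{N}'}$ is

$$r^1\left(\bar{\gamma}_{1,\pi(i+1)}, \bar{\gamma}_{1,\pi(i)}, \gamma^\star_{\mathcal{N}'}\right) = \log\left(1 + \max\left\{\bar{\gamma}_{1,\pi(i+1)}, \bar{\gamma}_{1,\pi(i)}, \gamma^\star_{\mathcal{N}'}\right\}\right). \qquad (20)$$

Therefore,

$$R^1\left(\tau_{\pi(i+1)}, \tau_{\pi(i)} \mid \gamma^\star_{\mathcal{N}'}\right) = \mathbb{E}\left[r^1\left(\bar{\gamma}_{1,\pi(i+1)}, \bar{\gamma}_{1,\pi(i)}, \gamma^\star_{\mathcal{N}'}\right) \mid \gamma^\star_{\mathcal{N}'}\right]. \qquad (21)$$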

As shown later in the paper, this approach helps us to use the results derived for a two-user system to 
simplify our analysis. Therefore, considering a two-user system first, the rate on beam 1 as a function 
of the thresholds is explicitly given in Lemma 5. 

Lemma 5: The rate on beam 1 of a two-user system is equal to

$$R^1(\tau) = \int_{\tau_{\pi(2)}}^{\infty} \log(1 + x)\,dF^2(x) + F\left(\tau_{\pi(2)}\right) \int_{\tau_{\pi(1)}}^{\tau_{\pi(2)}} \log(1 + x)\,dF(x).$$

Proof: See Appendix C. ■
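
The two-user expression in Lemma 5 can be cross-checked against a direct computation of $\mathbb{E}[\log(1 + \max\{\bar{\gamma}_{1,\pi(1)}, \bar{\gamma}_{1,\pi(2)}\})]$. The sketch below assumes i.i.d. Exp(1) SINRs and uses the identity $\mathbb{E}[\log(1+X)] = \int_0^\infty \Pr\{X > x\}/(1+x)\,dx$ for the direct computation. The threshold values and function names are illustrative.

```python
import math

def F(x):
    return 1.0 - math.exp(-x)

def f(x):
    return math.exp(-x)

def integrate(func, lo, hi, steps=20_000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

def lemma5_rate(tau_lo, tau_hi, upper=50.0):
    """Lemma 5 with tau_lo = tau_pi(1) <= tau_hi = tau_pi(2); dF^2 = 2 F f dx."""
    both = integrate(lambda x: math.log1p(x) * 2.0 * F(x) * f(x), tau_hi, upper)
    single = F(tau_hi) * integrate(lambda x: math.log1p(x) * f(x), tau_lo, tau_hi)
    return both + single

def tail_rate(tau_lo, tau_hi, upper=50.0):
    """Same quantity from the CDF of the maximum truncated SINR."""
    def prob_max_le(x):
        # Pr{truncated SINR <= x} equals F(max(x, tau)) for a threshold rule.
        return F(max(x, tau_lo)) * F(max(x, tau_hi))
    return integrate(lambda x: (1.0 - prob_max_le(x)) / (1.0 + x), 0.0, upper)

rate_lemma = lemma5_rate(0.4, 1.1)
rate_direct = tail_rate(0.4, 1.1)
print(f"Lemma 5: {rate_lemma:.5f}, direct: {rate_direct:.5f}")
```

The two numbers agree up to the accuracy of the numerical integration, which supports reading $dF^2(x)$ as the differential of $F(x)^2$.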

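Coming back to the general n-user scenario, it is not tractable to obtain the rate explicitly as we have
done in the previous lemma. However, we can explicitly write down an expression for $R^1(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$.
$R^1(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$ is parameterized by $\gamma^\star_{\mathcal{N}'}$, and its shape depends on the value of $\gamma^\star_{\mathcal{N}'}$. Three cases of
interest are $\gamma^\star_{\mathcal{N}'} \geq \tau_{\pi(i+1)}$, $\gamma^\star_{\mathcal{N}'} \leq \tau_{\pi(i)}$ and $\tau_{\pi(i)} < \gamma^\star_{\mathcal{N}'} < \tau_{\pi(i+1)}$. We will now establish three important
lemmas for these three cases, which will be useful in interpreting the rate function. The two-user rate
expression given in Lemma 5 functions as a building block to obtain beam 1 rate expressions in these
cases. We will start with the case $\gamma^\star_{\mathcal{N}'} \geq \tau_{\pi(i+1)}$.

Lemma 6: If $\gamma^\star_{\mathcal{N}'} \geq \tau_{\pi(i+1)}$, $R^1(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$ is given by

$$R_0\left(\gamma^\star_{\mathcal{N}'}\right) = \Pr\left\{\gamma^\star_{i+1,i} \leq \gamma^\star_{\mathcal{N}'} \mid \gamma^\star_{\mathcal{N}'}\right\} \log\left(1 + \gamma^\star_{\mathcal{N}'}\right) + \mathbb{E}\left[\log\left(1 + \gamma^\star_{i+1,i}\right) 1_{\{\gamma^\star_{i+1,i} > \gamma^\star_{\mathcal{N}'}\}} \mid \gamma^\star_{\mathcal{N}'}\right],$$

where $\gamma^\star_{i+1,i} = \max\{\bar{\gamma}_{1,\pi(i+1)}, \bar{\gamma}_{1,\pi(i)}\}$.

Proof: See Appendix D. ■

[Figure 2: panels labeled $R_1(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$, $R_0(\gamma^\star_{\mathcal{N}'})$ and $R_2(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$.]

Fig. 2. Beam 1 rate as a function of thresholds for different values of $\gamma^\star_{\mathcal{N}'}$.

^5 We suppress the dependency of $R^1$ on $\tau_{\pi(k)}$, $k \neq i, i+1$, here and later in the paper when we focus only on thresholds $\tau_{\pi(i)}$
and $\tau_{\pi(i+1)}$.

^6 Fixing random SINR values means conditioning on them in the probabilistic sense. Indeed, it is sufficient just to condition
on the maximum truncated beam 1 SINR value corresponding to MUs in $\mathcal{N}'$.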

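Note that $R^1(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$ depends only on $\gamma^\star_{\mathcal{N}'}$, but not on $\tau_{\pi(i)}$ and $\tau_{\pi(i+1)}$ when $\gamma^\star_{\mathcal{N}'} \geq \tau_{\pi(i+1)}$.
The next lemma provides an analogous expression for $R^1(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$ when $\gamma^\star_{\mathcal{N}'} \leq \tau_{\pi(i)}$.

Lemma 7: If $\gamma^\star_{\mathcal{N}'} \leq \tau_{\pi(i)}$, $R^1(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$ is given by

$$R_1\left(\tau_{\pi(i+1)}, \tau_{\pi(i)} \mid \gamma^\star_{\mathcal{N}'}\right) = \int_{\tau_{\pi(i+1)}}^{\infty} \log(1 + x)\,dF^2(x) + F\left(\tau_{\pi(i+1)}\right) \int_{\tau_{\pi(i)}}^{\tau_{\pi(i+1)}} \log(1 + x)\,dF(x)
+ \log\left(1 + \gamma^\star_{\mathcal{N}'}\right) F\left(\tau_{\pi(i)}\right) F\left(\tau_{\pi(i+1)}\right).$$

Proof: See Appendix D. ■

Finally, we look at the case where $\tau_{\pi(i)} < \gamma^\star_{\mathcal{N}'} < \tau_{\pi(i+1)}$.

Lemma 8: If $\tau_{\pi(i)} < \gamma^\star_{\mathcal{N}'} < \tau_{\pi(i+1)}$, $R^1(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$ is given by

$$R_2\left(\tau_{\pi(i+1)}, \tau_{\pi(i)} \mid \gamma^\star_{\mathcal{N}'}\right) = \int_{\tau_{\pi(i+1)}}^{\infty} \log(1 + x)\,dF^2(x) + F\left(\tau_{\pi(i+1)}\right) \int_{\gamma^\star_{\mathcal{N}'}}^{\tau_{\pi(i+1)}} \log(1 + x)\,dF(x)
+ \log\left(1 + \gamma^\star_{\mathcal{N}'}\right) F\left(\tau_{\pi(i+1)}\right) F\left(\gamma^\star_{\mathcal{N}'}\right).$$

Proof: See Appendix D. ■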
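
The three expressions in Lemmas 6 to 8 can be evaluated numerically and compared at the points where the cases meet. The sketch below assumes i.i.d. Exp(1) SINRs for the two MUs of interest, so that the probability and expectation in Lemma 6 reduce to $F(\gamma^\star_{\mathcal{N}'})^2 \log(1 + \gamma^\star_{\mathcal{N}'})$ plus an integral against $dF^2$. The threshold values and function names are illustrative, and `g` stands for $\gamma^\star_{\mathcal{N}'}$.

```python
import math

UPPER = 50.0  # truncation point for the Exp(1) tail

def F(x):
    return 1.0 - math.exp(-x)

def f(x):
    return math.exp(-x)

def integrate(func, lo, hi, steps=20_000):
    """Midpoint rule on [lo, hi]; returns 0 for an empty interval."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

def pair_tail(t):
    """Integral of log(1 + x) dF^2(x) over [t, infinity)."""
    return integrate(lambda x: math.log1p(x) * 2.0 * F(x) * f(x), t, UPPER)

def single(lo, hi):
    """Integral of log(1 + x) dF(x) over [lo, hi]."""
    return integrate(lambda x: math.log1p(x) * f(x), lo, hi)

def rate_0(g):                      # Lemma 6, g >= tau_hi
    return F(g) ** 2 * math.log1p(g) + pair_tail(g)

def rate_1(tau_lo, tau_hi, g):      # Lemma 7, g <= tau_lo
    return pair_tail(tau_hi) + F(tau_hi) * single(tau_lo, tau_hi) \
        + math.log1p(g) * F(tau_lo) * F(tau_hi)

def rate_2(tau_lo, tau_hi, g):      # Lemma 8, tau_lo < g < tau_hi
    return pair_tail(tau_hi) + F(tau_hi) * single(g, tau_hi) \
        + math.log1p(g) * F(tau_hi) * F(g)

tau_lo, tau_hi = 0.4, 1.1           # tau_pi(i) and tau_pi(i+1)
gap_low = abs(rate_1(tau_lo, tau_hi, tau_lo) - rate_2(tau_lo, tau_hi, tau_lo))
gap_high = abs(rate_0(tau_hi) - rate_2(tau_lo, tau_hi, tau_hi))
print(gap_low, gap_high)
```

Both gaps vanish up to rounding, so the three pieces join without a jump at $\tau_{\pi(i)}$ and $\tau_{\pi(i+1)}$.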

For the final two cases, we note that $R^1(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$ depends both on threshold values $\tau_{\pi(i)}$
and $\tau_{\pi(i+1)}$, and on $\gamma^\star_{\mathcal{N}'}$. The results of these three lemmas have been graphically summarized in Fig. 2.
If $\gamma^\star_{\mathcal{N}'} = \tau_{\pi(i)}$, $R_1$ and $R_2$ in Lemmas 7 and 8 evaluate to the same expression. Similarly, if $\gamma^\star_{\mathcal{N}'} =
\tau_{\pi(i+1)}$, $R_0$ and $R_2$ in Lemmas 6 and 8 evaluate to the same expression. This shows that the rate as a
function of $\gamma^\star_{\mathcal{N}'}$ is continuous at $\tau_{\pi(i)}$ and $\tau_{\pi(i+1)}$.

Given the initial threshold values $\{\tau_{\pi(k)}\}_{k=1}^{n}$, the first step to discover the Schur-concave structure in
the sum-rate function is to analyze the behavior of the function

$$R^1\left(\tau_{\pi(i+1)} + \epsilon, \tau_{\pi(i)} - \epsilon \mid \gamma^\star_{\mathcal{N}'}\right)$$

for $\epsilon \in \left[0, \min\{\tau_{\pi(i)} - \tau_{\pi(i-1)}, \tau_{\pi(i+2)} - \tau_{\pi(i+1)}\}\right]$ by making use of Lemma 4. This is now a scalar
problem. At this point, it is more useful to interpret the sum-rate as a function of feedback probabilities
since the feedback constraint in (14) is in terms of these probabilities. This interpretation helps us to
incorporate the feedback constraints into our optimization problem more easily, as will be shown next.

2) Rate as a Function of Feedback Probabilities: There is a one-to-one correspondence between
feedback thresholds $\tau_{\pi(i)}$ and feedback probabilities $p_{\pi(i)}$ since $f$ has the support $\mathbb{R}_+$, i.e., $\tau_{\pi(i)} =
F^{-1}(1 - p_{\pi(i)})$. Hence, we can represent $R^1(\tau_{\pi(i+1)}, \tau_{\pi(i)} | \gamma^\star_{\mathcal{N}'})$ as $R^1(p_{\pi(i)}, p_{\pi(i+1)} | \gamma^\star_{\mathcal{N}'})$ without any
ambiguity. With this interpretation, the optimization problem in (14) can be considered as the problem
of finding the optimum feedback probability vector $p^\star = (p_1^\star, \ldots, p_n^\star)$ in $[0, 1]^n$ subject to the feedback
constraint $\sum_{i=1}^{n} p_i \leq \lambda$. Indeed, it is easy to see that any feedback policy solving (14) must achieve the
feedback constraint with equality, i.e., $\sum_{i=1}^{n} p_i = \lambda$.

Since $F$ is monotone increasing, we have $p_{\pi(1)} \geq p_{\pi(2)} \geq \cdots \geq p_{\pi(i)} \geq p_{\pi(i+1)} \geq \cdots \geq p_{\pi(n)}$.
Focusing on $p_{\pi(i)}$ and $p_{\pi(i+1)}$, we have the feedback level $\lambda_i = p_{\pi(i)} + p_{\pi(i+1)}$, and other probabilities
give us natural boundaries on $p_{\pi(i)}$ and $p_{\pi(i+1)}$ such that $p_{\pi(i+2)} \leq p_{\pi(i+1)} \leq p_{\pi(i)} \leq p_{\pi(i-1)}$. Without
violating these boundaries, we will vary $p_{\pi(i)}$ and $p_{\pi(i+1)}$ by keeping $\lambda_i$ constant.

Similar to the previous part, we start our analysis by focusing on a two-user system. Given a feedback
constraint $\lambda > 0$, we can restrict our search for the optimal feedback probability vector to the plane given
by $p_{\pi(1)} + p_{\pi(2)} = \lambda$. On this plane, we write the rate function $R^1(p)$ as a function of only $p_{\pi(2)}$ without
any ambiguity. The communication rate on this plane as a function of $p_{\pi(2)}$ is given below.

Lemma 9: The rate on beam 1 of a two-user system on the plane

$$\mathcal{P} = \left\{p \in [0, 1]^2 : p_{\pi(1)} + p_{\pi(2)} = \lambda\right\}$$

as a function of $p_{\pi(2)}$ is equal to

$$R^1\left(p_{\pi(2)}\right) = \int_{F^{-1}(1 - p_{\pi(2)})}^{\infty} \log(1 + x)\,dF^2(x) + \left(1 - p_{\pi(2)}\right) \int_{F^{-1}(1 + p_{\pi(2)} - \lambda)}^{F^{-1}(1 - p_{\pi(2)})} \log(1 + x)\,dF(x)$$

for $\max\{0, \lambda - 1\} \leq p_{\pi(2)} \leq \frac{\lambda}{2}$.

Proof: Follows from a direct substitution of $\tau_{\pi(2)} = F^{-1}(1 - p_{\pi(2)})$ and $\tau_{\pi(1)} = F^{-1}(1 + p_{\pi(2)} - \lambda)$ in Lemma 5. ■

$F^{-1}$ in the expression above represents the functional inverse of $F$. We give the first derivative of the
two-user rate in the following lemma.

[Figure 3: the ordered feedback probabilities $p_{\pi(i+2)}$, $p_{\pi(i+1)}$, $p_{\pi(i)}$, $p_{\pi(i-1)}$ on the $p$ axis, with the point $p_{\pi(i+1)} - \min\{p_{\pi(i+1)} - p_{\pi(i+2)}, p_{\pi(i-1)} - p_{\pi(i)}\}$ marked.]

Fig. 3. Ordered feedback probabilities, and the range of $q$ and $\epsilon$.



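Lemma 10: The first derivative of $R^1(p_{\pi(2)})$ on $\mathcal{P}$ is equal to

$$\frac{dR^1\left(p_{\pi(2)}\right)}{dp_{\pi(2)}} = \int_{F^{-1}(1 + p_{\pi(2)} - \lambda)}^{F^{-1}(1 - p_{\pi(2)})} \frac{F(x)}{1 + x}\,dx - \left(\lambda - 2p_{\pi(2)}\right) \log\left(1 + F^{-1}\left(1 + p_{\pi(2)} - \lambda\right)\right) \qquad (22)$$

for $\max\{0, \lambda - 1\} \leq p_{\pi(2)} \leq \frac{\lambda}{2}$.

Proof: Follows directly after differentiating the rate expression in Lemma 9. ■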
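
The derivative in (22) can be compared with a central finite difference of the rate in Lemma 9. The sketch below assumes an Exp(1) SINR distribution. The values of $\lambda$ and $p_{\pi(2)}$, the step size and the function names are illustrative choices.

```python
import math

def F(x):
    return 1.0 - math.exp(-x)

def f(x):
    return math.exp(-x)

def F_inv(u):
    return -math.log1p(-u)

def integrate(func, lo, hi, steps=20_000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

def rate(q, lam, upper=50.0):
    """Two-user beam 1 rate of Lemma 9 as a function of q = p_pi(2)."""
    hi = F_inv(1.0 - q)        # tau_pi(2)
    lo = F_inv(1.0 + q - lam)  # tau_pi(1)
    both = integrate(lambda x: math.log1p(x) * 2.0 * F(x) * f(x), hi, upper)
    one = integrate(lambda x: math.log1p(x) * f(x), lo, hi)
    return both + (1.0 - q) * one

def derivative(q, lam):
    """Right-hand side of (22)."""
    hi = F_inv(1.0 - q)
    lo = F_inv(1.0 + q - lam)
    return integrate(lambda x: F(x) / (1.0 + x), lo, hi) - (lam - 2.0 * q) * math.log1p(lo)

lam, q, step = 1.2, 0.4, 1e-4
finite_difference = (rate(q + step, lam) - rate(q - step, lam)) / (2.0 * step)
closed_form = derivative(q, lam)
print(f"finite difference: {finite_difference:.6f}, (22): {closed_form:.6f}")
```

The chosen point satisfies $\max\{0, \lambda - 1\} \leq q \leq \lambda/2$, and the two values agree to within the accuracy of the numerical integration.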

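We note that Lemma 4 implies the necessity of $\frac{dR^1(p_{\pi(2)})}{dp_{\pi(2)}} \geq 0$ for all $p_{\pi(2)} \in \left[\max\{0, \lambda - 1\}, \frac{\lambda}{2}\right]$ and
$\lambda \in [0, 2]$ for the Schur-concavity of the two-user sum-rate. Consider now the n-user scenario. Given the
initial feedback probabilities $\{p_{\pi(k)}\}_{k=1}^{n}$, we need to analyze the behavior of the function

$$g(\epsilon) = R^1\left(p_{\pi(i)} + \epsilon, p_{\pi(i+1)} - \epsilon \mid \gamma^\star_{\mathcal{N}'}\right) \qquad (23)$$

for $\epsilon \in \left[0, \min\{p_{\pi(i-1)} - p_{\pi(i)}, p_{\pi(i+1)} - p_{\pi(i+2)}\}\right]$ to discover Schur-concavity of the rate function by
Lemma 4. We have already discussed how we can vary $p_{\pi(2)}$ by keeping $\lambda$ constant for the two-user
case. Analysis of the general n-user scenario is not fundamentally different from the two-user scenario,
and a similar technique used for the analysis of the two-user rate function can still be applied for the
general n-user case without violating the boundary conditions on feedback probabilities. That is, we
introduce an auxiliary variable $q \in \mathcal{P}_{i+1}$, replace $p_{\pi(i+1)} - \epsilon$ with $q$ and $p_{\pi(i)} + \epsilon$ with $\lambda_i - q$, and write
$R^1(p_{\pi(i)} + \epsilon, p_{\pi(i+1)} - \epsilon | \gamma^\star_{\mathcal{N}'})$ as a function of $q$, where

$$\mathcal{P}_{i+1} = \left[p_{\pi(i+1)} - \min\left\{p_{\pi(i+1)} - p_{\pi(i+2)}, p_{\pi(i-1)} - p_{\pi(i)}\right\}, \frac{\lambda_i}{2}\right].$$

Fig. 3 provides a graphical representation for the selection of $q$. By using Lemmas 6, 7 and 8, we have

$$R^1\left(q \mid \gamma^\star_{\mathcal{N}'}\right) = R_0\left(\gamma^\star_{\mathcal{N}'}\right) 1_{\{q \geq 1 - F(\gamma^\star_{\mathcal{N}'})\}} + R_1\left(q \mid \gamma^\star_{\mathcal{N}'}\right) 1_{\{q \geq \lambda_i - (1 - F(\gamma^\star_{\mathcal{N}'}))\}}
+ R_2\left(q \mid \gamma^\star_{\mathcal{N}'}\right) 1_{\{q < \min\{1 - F(\gamma^\star_{\mathcal{N}'}),\, \lambda_i - (1 - F(\gamma^\star_{\mathcal{N}'}))\}\}} \qquad (24)$$

for $q \in \mathcal{P}_{i+1}$.

[Figure 4: the rate as a function of $q$, equal to $R_2(q | \gamma^\star_{\mathcal{N}'})$ up to $q = 1 - F(\gamma^\star_{\mathcal{N}'})$ and to $R_0(\gamma^\star_{\mathcal{N}'})$ from there up to $q = \lambda_i / 2$.]

Fig. 4. A pictorial representation for the rate expression in (24) for $q_{\min} < 1 - F(\gamma^\star_{\mathcal{N}'}) < \frac{\lambda_i}{2}$.

Some insights about (24) are as follows. Let $q_{\min} = p_{\pi(i+1)} - \min\{p_{\pi(i+1)} - p_{\pi(i+2)}, p_{\pi(i-1)} - p_{\pi(i)}\}$,
and assume $1 - F(\gamma^\star_{\mathcal{N}'}) < \frac{\lambda_i}{2}$. If $1 - F(\gamma^\star_{\mathcal{N}'}) \leq q_{\min}$, $R^1(q | \gamma^\star_{\mathcal{N}'})$ is equal to $R_0(\gamma^\star_{\mathcal{N}'})$ for all $q \in \left[q_{\min}, \frac{\lambda_i}{2}\right]$.
On the other hand, if $1 - F(\gamma^\star_{\mathcal{N}'}) > q_{\min}$, $R^1(q | \gamma^\star_{\mathcal{N}'})$ first becomes equal to $R_2(q | \gamma^\star_{\mathcal{N}'})$ and then equal
to $R_0(\gamma^\star_{\mathcal{N}'})$ as $q$ changes from $q_{\min}$ to $\frac{\lambda_i}{2}$. This behavior is graphically depicted in Fig. 4.^7 Therefore, the
rate in this case can be visualized as a concatenation of two functions with a gluing point at $1 - F(\gamma^\star_{\mathcal{N}'})$.
Similar explanations can be given for $1 - F(\gamma^\star_{\mathcal{N}'}) \geq \frac{\lambda_i}{2}$.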

3) Schur-concavity of the Sum-rate Function: Building upon our analysis above, we will obtain sufficient conditions for the Schur-concavity of the sum-rate in this part. We start our analysis by first providing a proof for Theorem 5. We restate Theorem 5 below for the sake of completeness.

Theorem 5: The sum-rate $R(\mathbf{p})$ is a Schur-concave function if

$$\log(1+\gamma)(\lambda - 2q) + \int_{F^{-1}(1+q-\lambda)}^{F^{-1}(1-q)} \frac{F(x)}{1+x}\,dx - (\lambda - 2q) \log\left(1 + F^{-1}(1 + q - \lambda)\right) \ge 0$$

for all $\gamma \ge 0$, $\lambda \in [0, 2]$ and $\max\{0, \lambda - 1\} \le q \le \frac{\lambda}{2}$.

^3 The plot may not be exactly accurate. It is just given to conceptualize the behavior of the rate function.

Proof: It is enough to show that $R(q \mid \gamma_{\mathcal{N}'})$ is a non-decreasing function of $q \in \mathcal{P}_{i+1}$ for all $i = 1, \ldots, n-1$ and $\gamma_{\mathcal{N}'} \ge 0$ based on Lemma 4. To this end, we can write $R_1(q \mid \gamma_{\mathcal{N}'})$ explicitly as

$$R_1(q \mid \gamma_{\mathcal{N}'}) = \int_{F^{-1}(1-q)}^{\infty} \log(1+x)\,dF^2(x) + (1-q) \int_{F^{-1}(1+q-\lambda_i)}^{F^{-1}(1-q)} \log(1+x)\,dF(x) + \log(1+\gamma_{\mathcal{N}'})(1-q)(1+q-\lambda_i).$$



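Using Lemma 10, we get

$$\frac{dR_1(q \mid \gamma_{\mathcal{N}'})}{dq} = \log(1+\gamma_{\mathcal{N}'})(\lambda_i - 2q) + \int_{F^{-1}(1+q-\lambda_i)}^{F^{-1}(1-q)} \frac{F(x)}{1+x}\,dx - (\lambda_i - 2q) \log\left(1 + F^{-1}(1 + q - \lambda_i)\right). \qquad (25)$$
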
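Similarly, we can write $R_2(q \mid \gamma_{\mathcal{N}'})$ explicitly as

$$R_2(q \mid \gamma_{\mathcal{N}'}) = \int_{F^{-1}(1-q)}^{\infty} \log(1+x)\,dF^2(x) + (1-q) \int_{\gamma_{\mathcal{N}'}}^{F^{-1}(1-q)} \log(1+x)\,dF(x) + \log(1+\gamma_{\mathcal{N}'})(1-q)F(\gamma_{\mathcal{N}'}).$$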



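Differentiation and integration-by-parts give us

$$\frac{dR_2(q \mid \gamma_{\mathcal{N}'})}{dq} = \int_{\gamma_{\mathcal{N}'}}^{F^{-1}(1-q)} \frac{F(x)}{1+x}\,dx \ge 0.$$

Thus, $R(q \mid \gamma_{\mathcal{N}'})$ is a non-decreasing function of $q \in \mathcal{P}_{i+1}$ for all $i = 1, \ldots, n-1$ and $\gamma_{\mathcal{N}'} \ge 0$ if (16) holds. ■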

Second, we provide a proof for Theorem 6 based on Theorem 5. The new sufficient condition for the Schur-concavity of the sum-rate function is obtained by means of a second order analysis. Although complex looking, it turns out to be much easier to deal with analytically as illustrated for Rayleigh fading channels in the next section. Again, we restate Theorem 6 below for the sake of completeness.

Theorem 6: The sum-rate $R(\mathbf{p})$ is Schur-concave if $f$ is bounded at zero, and has the derivative $f'$ satisfying $f'\left(F^{-1}(x)\right) \le -\frac{f\left(F^{-1}(x)\right)}{1 + F^{-1}(x)}$ for all $x \in [0, 1]$.

Proof: Let $U(q, \lambda) = \int_{F^{-1}(1+q-\lambda)}^{F^{-1}(1-q)} \frac{F(x)}{1+x}\,dx - (\lambda - 2q) \log\left(1 + F^{-1}(1 + q - \lambda)\right)$. Then, it is enough to show that $U(q, \lambda) \ge 0$ for all $\lambda \in [0, 2]$ and $\max\{0, \lambda - 1\} \le q \le \frac{\lambda}{2}$ by Theorem 5. To this end, it is enough to show $\frac{\partial U(q, \lambda)}{\partial q} \le 0$ for all $\lambda \in [0, 2]$ and $\max\{0, \lambda - 1\} \le q \le \frac{\lambda}{2}$ since $U\left(\frac{\lambda}{2}, \lambda\right) = 0$.

The following lemma simplifies the proof considerably. 

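Lemma 11: Let $G(x) = \log\left(1 + F^{-1}(x)\right)\left(1 + F^{-1}(x)\right) f\left(F^{-1}(x)\right) - x$ for $x \in [0, 1]$. If $f$ is bounded at zero and $f'$ satisfies $f'\left(F^{-1}(x)\right) \le -\frac{f\left(F^{-1}(x)\right)}{1 + F^{-1}(x)}$ for all $x \in [0, 1]$, then $G \le 0$ on $[0, 1]$.

Proof: By taking the first derivative of $G(x)$ with respect to $x$, we obtain

$$\frac{dG(x)}{dx} = \log\left(1 + F^{-1}(x)\right)\left(1 + F^{-1}(x)\right) \frac{f'\left(F^{-1}(x)\right)}{f\left(F^{-1}(x)\right)} + \log\left(1 + F^{-1}(x)\right) = \log\left(1 + F^{-1}(x)\right) \frac{\left(1 + F^{-1}(x)\right) f'\left(F^{-1}(x)\right) + f\left(F^{-1}(x)\right)}{f\left(F^{-1}(x)\right)} \le 0.$$

Hence, $G(x)$ is decreasing for $x > 0$, and achieves its maximum at $x = 0$. We have $\lim_{x \to 0} G(x) = 0$ since $f(x)$ is bounded at 0, which completes the proof. ■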

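Now, consider the partial derivative of $U(q, \lambda)$ with respect to $q$, which is equal to

$$\frac{\partial U(q, \lambda)}{\partial q} = \frac{1-q}{1 + F^{-1}(1-q)} \cdot \frac{-1}{f\left(F^{-1}(1-q)\right)} - \frac{1+q-\lambda}{1 + F^{-1}(1+q-\lambda)} \cdot \frac{1}{f\left(F^{-1}(1+q-\lambda)\right)} - \frac{\lambda - 2q}{1 + F^{-1}(1+q-\lambda)} \cdot \frac{1}{f\left(F^{-1}(1+q-\lambda)\right)} + 2\log\left(1 + F^{-1}(1+q-\lambda)\right).$$

Taking the common denominators gives us

$$\frac{\partial U(q, \lambda)}{\partial q} = K_1(q)\, g_1(q, \lambda) + K_2(q, \lambda)\, g_2(q, \lambda),$$

where

$$g_1(q, \lambda) = \log\left(1 + F^{-1}(1+q-\lambda)\right)\left(1 + F^{-1}(1-q)\right) f\left(F^{-1}(1-q)\right) - (1-q),$$

$$g_2(q, \lambda) = \log\left(1 + F^{-1}(1+q-\lambda)\right)\left(1 + F^{-1}(1+q-\lambda)\right) f\left(F^{-1}(1+q-\lambda)\right) - (1-q),$$

$$K_1(q) = \frac{1}{\left(1 + F^{-1}(1-q)\right) f\left(F^{-1}(1-q)\right)} \quad \text{and} \quad K_2(q, \lambda) = \frac{1}{\left(1 + F^{-1}(1+q-\lambda)\right) f\left(F^{-1}(1+q-\lambda)\right)}.$$
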
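Note that $K_1$ and $K_2$ are always positive. Thus, it is enough to show that $g_1$ and $g_2$ are non-positive on $\left[\max\{0, \lambda - 1\}, \frac{\lambda}{2}\right]$ for any fixed $\lambda \in [0, 2]$. To this end, $g_1$ and $g_2$ on $\left[\max\{0, \lambda - 1\}, \frac{\lambda}{2}\right]$ can be upper bounded as

$$g_1(q, \lambda) \le \bar{g}_1(q) = \log\left(1 + F^{-1}(1-q)\right)\left(1 + F^{-1}(1-q)\right) f\left(F^{-1}(1-q)\right) - (1-q)$$

and

$$g_2(q, \lambda) \le \bar{g}_2(q, \lambda) = \log\left(1 + F^{-1}(1+q-\lambda)\right)\left(1 + F^{-1}(1+q-\lambda)\right) f\left(F^{-1}(1+q-\lambda)\right) - (1+q-\lambda).$$

The first bound holds because $F^{-1}(1+q-\lambda) \le F^{-1}(1-q)$, and the second one holds because $1+q-\lambda \le 1-q$ for $q \le \frac{\lambda}{2}$. Now, using Lemma 11, we can show that both $\bar{g}_1$ and $\bar{g}_2$ are non-positive functions on $\left[\max\{0, \lambda - 1\}, \frac{\lambda}{2}\right]$, since $\bar{g}_1(q) = G(1-q)$ and $\bar{g}_2(q, \lambda) = G(1+q-\lambda)$. This means $\frac{\partial U(q, \lambda)}{\partial q} \le 0$, which implies $U(q, \lambda) \ge 0$ for all $\lambda \in [0, 2]$ and $\max\{0, \lambda - 1\} \le q \le \frac{\lambda}{2}$. ■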
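As a quick numerical illustration of how the condition in Theorem 6 and the function $G$ in Lemma 11 can be checked, the Python sketch below evaluates both on a grid. It assumes an exponential SINR density $f(t) = e^{-t/\rho}/\rho$, which is a choice made here for concreteness (it corresponds to the single-beam Rayleigh case discussed in Section VI). The function names, grid sizes and tolerance are illustrative assumptions, not part of the analysis above.

```python
import math

def pdf(t, rho):
    """Assumed SINR density: exponential with mean rho (single-beam Rayleigh)."""
    return math.exp(-t / rho) / rho

def theorem6_condition_holds(rho, t_max=50.0, grid=5000):
    """Grid check of f'(t) <= -f(t) / (1 + t) for t in [0, t_max]."""
    for k in range(grid + 1):
        t = t_max * k / grid
        f = pdf(t, rho)
        f_prime = -f / rho              # derivative of the exponential density
        if f_prime > -f / (1.0 + t) + 1e-15:
            return False
    return True

def G(x, rho):
    """G(x) = log(1 + F^{-1}(x)) (1 + F^{-1}(x)) f(F^{-1}(x)) - x from Lemma 11."""
    t = -rho * math.log(1.0 - x)        # inverse CDF of the exponential law
    return math.log1p(t) * (1.0 + t) * pdf(t, rho) - x

def max_G(rho, grid=2000):
    """Largest value of G on the grid x = 0, 1/grid, ..., (grid - 1)/grid."""
    return max(G(k / grid, rho) for k in range(grid))

for rho in (0.5, 1.0, 10.0):
    print(rho, theorem6_condition_holds(rho), max_G(rho))
```

Under this assumed density the condition reduces to $1 + t \ge \rho$ for all $t \ge 0$, so the grid check passes for $\rho \le 1$ and fails for $\rho > 1$. A positive value of $G$ only indicates that this sufficient condition is not met; on its own it does not show that the sum-rate fails to be Schur-concave.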
VI. Applications and Discussion

In this section, we will apply our results derived in Sections IV and V to well known fading channel 
models. We will also discuss intuition behind the resulting performance figures. We start our discussion 
with Rayleigh fading channels, which is one of the most commonly used channel models in the literature, 
e.g., see [32]-[34], and closely approximates measured data rates in densely populated urban areas [35]. 

A. Rayleigh Fading Channels 

Consider the Rayleigh fading channel model in which $h_{k,i}$, $k = 1, \ldots, N_t$ and $i = 1, \ldots, n$, are assumed to be i.i.d. with the common distribution $\mathcal{CN}(0, 1)$, where $\mathcal{CN}(\mu, \sigma^2)$ represents the circularly-symmetric complex Gaussian distribution with mean $\mu$ and variance $\sigma^2$. Recall that the background noise is the unit power (complex) Gaussian noise, and therefore $\rho$ is interpreted as the average SNR below.

For this channel model, the SINR distribution function $F$ and the associated probability density function $f$ can be given as

$$F(x) = 1 - \frac{e^{-\frac{x}{\rho}}}{(x+1)^{M-1}} \qquad (26)$$

and

$$f(x) = \frac{e^{-\frac{x}{\rho}}}{(x+1)^{M}} \left[ \frac{1}{\rho}(x+1) + M - 1 \right], \qquad (27)$$

respectively [13]. An important quantity of interest to apply our results in Theorems 5 and 6 is the functional inverse, $F^{-1}$, of $F$. The next lemma provides an analytical expression for $F^{-1}$ for Rayleigh fading channels.

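Lemma 12: $F^{-1}$ is equal to

$$F^{-1}(x) = \begin{cases} -1 + (M-1)\rho\, W\left( \frac{\exp\left(\frac{1}{(M-1)\rho}\right)}{(M-1)\rho} (1-x)^{-\frac{1}{M-1}} \right) & \text{if } M \ge 2 \\ -\rho \log(1-x) & \text{if } M = 1 \end{cases}$$

where $x \in [0, 1]$ and $W$ is the Lambert W function given by the defining equation $W(x) \exp(W(x)) = x$ for $x \ge -\frac{1}{e}$.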

Proof: See Appendix E. ■ 
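The inverse in Lemma 12 can be evaluated numerically as sketched below in Python. The sketch assumes the SINR distribution $F(x) = 1 - e^{-x/\rho}/(x+1)^{M-1}$, which is consistent with Lemma 12, and it uses a small Newton iteration for the Lambert W function so that it stays self-contained. The helper names (`lambert_w`, `sinr_cdf`, `sinr_cdf_inverse`) and the stopping rule are illustrative assumptions. For very small values of $(M-1)\rho$ the exponential inside $W$ can overflow, which this sketch does not handle.

```python
import math

def lambert_w(z, tol=1e-12, max_iter=100):
    """Principal branch of W for z >= 0, via Newton's method on w * exp(w) = z."""
    if z < 0.0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)                   # starting point at or above the root
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def sinr_cdf(x, rho, M):
    """Assumed SINR distribution F(x) = 1 - exp(-x/rho) / (1 + x)^(M - 1)."""
    return 1.0 - math.exp(-x / rho) / (1.0 + x) ** (M - 1)

def sinr_cdf_inverse(y, rho, M):
    """F^{-1}(y) following Lemma 12, for 0 <= y < 1."""
    if not 0.0 <= y < 1.0:
        raise ValueError("y must lie in [0, 1)")
    if M == 1:
        return -rho * math.log(1.0 - y)
    c = (M - 1) * rho
    z = math.exp(1.0 / c) / c * (1.0 - y) ** (-1.0 / (M - 1))
    return -1.0 + c * lambert_w(z)

# Round trip: F(F^{-1}(y)) should return y up to numerical error.
for M in (1, 2, 4):
    y = 0.9
    print(M, sinr_cdf(sinr_cdf_inverse(y, 1.0, M), 1.0, M))
```

A library routine for the Lambert W function could replace the Newton iteration; it is written out here only to avoid external dependencies.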

To motivate the discussion below, we start by providing two simple numerical examples, the first of which illustrates a network configuration in which homogenous threshold feedback policies are optimal, whereas the second provides another network configuration in which homogenous threshold feedback policies are strictly suboptimal.

Consider two MUs located in a Rayleigh fading environment, i.e., all channel (amplitude) gains are random with distribution $\mathcal{CN}(0, 1)$. $M$ and $\lambda$ are chosen to be $M = 1$ and $\lambda = 0.5$ in both examples below. We set $\rho$ to 0 [dB] in the first example, while it is set to 10 [dB] in the second one. Since all MUs are identical with identical fading characteristics in this set-up, it is intuitively expected that a homogenous threshold feedback policy must be optimal, and solve the rate maximization problem in (5) under both network configurations.

This is indeed correct for the first network configuration as shown in Fig. 5(a). The sum-rate is clearly maximized at $\mathbf{p} = (0.25, 0.25)$, and therefore the homogenous threshold feedback policy with thresholds set as $\boldsymbol{\tau}_{\mathrm{homo}} = (\log(4), \log(4))$ solves (5). However, this intuition does not always work as illustrated by the second example. In this case, the homogenous threshold feedback policy equalizing the feedback probabilities of MUs becomes strictly suboptimal, see Fig. 5(b). This shows that $R(\mathbf{p})$ is not a Schur-concave function of feedback probabilities for these selections of model parameters, and hence, it is not necessarily maximized at $\mathbf{p} = (0.25, 0.25)$. We note that the selection of parameters in both examples is just for elucidatory purposes, and the same arguments continue to hold for other values of $\lambda$.
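The two examples can be reproduced approximately with the short Python sketch below. It assumes the single-beam exponential SINR model $F(x) = 1 - e^{-x/\rho}$, writes the feedback probabilities as $(\lambda - q, q)$ with $0 < q \le \lambda/2$, and uses the two-user form of the rate expression from the proof of Theorem 5 (the expression for $R_1$ with no other MUs present). The Simpson rule, the truncation of the upper integration limit at 50 mean values, and the grid over $q$ are illustrative choices of this sketch, not the procedure used to generate Fig. 5.

```python
import math

def simpson(g, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0

def two_user_sum_rate(q, lam, rho):
    """Sum-rate in nats for feedback probabilities (lam - q, q), single beam."""
    if not (0.0 < q <= lam / 2.0 and lam - q <= 1.0):
        raise ValueError("need 0 < q <= lam/2 and lam - q <= 1")
    F = lambda x: 1.0 - math.exp(-x / rho)
    f = lambda x: math.exp(-x / rho) / rho
    tau_lo = -rho * math.log(lam - q)   # threshold of the MU with probability lam - q
    tau_hi = -rho * math.log(q)         # threshold of the MU with probability q
    # Above tau_hi the scheduled SINR is the maximum of two draws: d(F^2) = 2 F f dx.
    both = simpson(lambda x: math.log1p(x) * 2.0 * F(x) * f(x),
                   tau_hi, tau_hi + 50.0 * rho)
    # Between the thresholds only the first MU can be scheduled, and only if the
    # second MU stays below tau_hi, which happens with probability 1 - q.
    only_first = (1.0 - q) * simpson(lambda x: math.log1p(x) * f(x), tau_lo, tau_hi)
    return both + only_first

def best_q(lam, rho, steps=25):
    """Grid search over q in (0, lam/2]; q = lam/2 is the homogenous policy."""
    qs = [lam / 2.0 * k / steps for k in range(1, steps + 1)]
    return max(qs, key=lambda q: two_user_sum_rate(q, lam, rho))

for rho_db in (0.0, 10.0):
    rho = 10.0 ** (rho_db / 10.0)
    q_star = best_q(0.5, rho)
    gap = two_user_sum_rate(q_star, 0.5, rho) - two_user_sum_rate(0.25, 0.5, rho)
    print(rho_db, q_star, gap)
```

With these assumptions the grid maximizer coincides with the homogenous point $q = 0.25$ at 0 [dB], and moves to a strictly smaller $q$ at 10 [dB].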

This discussion motivates the following question: When are homogenous threshold feedback policies 
optimal for Rayleigh fading channels? The answer is supplied by the following two theorems. 

Theorem 7: For Rayleigh fading environments with $M = 1$ and $\rho \le 1$, $R(\mathbf{p})$ is a Schur-concave function of feedback probabilities, and therefore the homogenous threshold feedback policy satisfying feedback constraints with equality solves (5) when $M = 1$ and $\rho \le 1$.

Theorem 8: For Rayleigh fading environments with $M > 1$, $R(\mathbf{p})$ is a Schur-concave function of feedback probabilities, and therefore the homogenous threshold feedback policy satisfying feedback constraints with equality solves (5) when $M > 1$.

Proof: See Appendix F. ■ 

Since the proofs are similar, and are based on the sufficient condition established in Theorem 6, we skip the proof of Theorem 7 to avoid repetitions. Theorem 7 shows that it is enough to have $\rho$ smaller than or equal to 1 to ensure the optimality of homogenous threshold feedback policies for Rayleigh fading environments when only a single beam is used for the downlink communication. Since $F$ in (26) does not depend on $N_t$, the same result continues to hold for $N_t > 1$ as long as multiple transmit antennas are used to form a single beam as in [5].

On the other hand, Theorem 8 provides an extension of Theorem 7 to multiple beams. Theorem 8 is promising for multiuser MIMO downlink communication in a Rayleigh fading environment because it shows that homogenous threshold feedback policies are always optimal if multiple beams are used to communicate with multiple MUs simultaneously. Although the optimality of homogenous threshold feedback policies strongly depends on the properties of the underlying fading process modulating received signal strengths and the background noise level present in the system for the single beam case, this is not true anymore for multiple beams. More intuition is provided on this point later.

Fig. 5. Numerical examples illustrating the optimality and sub-optimality of homogenous threshold policies for different network configurations. (a) Behavior of the sum-rate as a function of the feedback probability $p_2$ of the second MU for the first example ($M = 1$, $\lambda = 0.5$ and $\rho = 0$ [dB]). (b) Behavior of the sum-rate as a function of the feedback probability $p_2$ of the second MU for the second example ($M = 1$, $\lambda = 0.5$ and $\rho = 10$ [dB]).

From a theoretical viewpoint, it is surprising to see that a property holding in the setting of a more 
complicated and general MIMO system model does not always hold for single-input systems. From 
a practical viewpoint, MIMO technology is becoming an integral part and a key feature of the next 
generation wireless communication systems. Thus, these results provide analytically justified design 
guidelines to maximize data rates subject to feedback constraints in densely populated urban areas with 
4G communication systems. 

In the second example above, the rate loss due to the use of the homogenous feedback policy seems to be very minor, around 0.01 [nats per channel use], and therefore it can be thought to be negligible for all practical purposes. This motivates us to examine the rate difference between homogenous and optimal threshold feedback policies for a broad spectrum of the SNR parameter to verify or falsify this conception. To this end, we investigate the optimality gap arising from the use of homogenous threshold feedback policies as opposed to choosing thresholds optimally to maximize the sum-rate in Rayleigh fading environments in Fig. 6. We set $n$ to 2, $\lambda$ to 0.5 and $M$ to 1 in this numerical example. Note that homogenous threshold feedback policies are always optimal when $M > 1$. Hence, there is no optimality gap to investigate in this case. For other values of $\lambda$ and $n$, qualitatively similar observations continue to hold. Since we find optimal threshold levels through an exhaustive search, setting $n$ to 2 limits our search space.

Fig. 6. The optimality gap arising from the use of homogenous threshold feedback policies for different values of $\rho$ when $M = 1$.

For small values of $\rho$ up to 0 [dB], the homogenous threshold feedback policy with threshold levels set as $\boldsymbol{\tau}_{\mathrm{homo}} = (\rho \log(4), \rho \log(4))$ is optimum as predicted by Theorem 7. It continues to be optimum up to SNR values of around 5.7 [dB], after which it becomes strictly suboptimal to use the homogenous threshold feedback policy in terms of the achieved downlink sum-rate. Furthermore, as channel conditions become better, i.e., for large values of $\rho$, the optimality gap becomes larger. Practically, this observation indicates that the use of homogenous threshold feedback policies may lead to excessive rate loss in the high SNR regime for single beam systems when compared to the rate achieved by the optimal feedback policy.
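The transition point of around 5.7 [dB] can be cross-checked with a local test, sketched below in Python. The sketch evaluates the expression for $\partial U(q, \lambda) / \partial q$ from the proof of Theorem 6 at the homogenous point $q = \lambda/2$ for the exponential SINR model ($M = 1$), where it simplifies to $2\left[\log(1 + \tau) - \frac{(1-q)\rho}{(1+\tau)q}\right]$ with $\tau = -\rho \log q$. A positive value means that the sum-rate decreases as $q$ approaches $\lambda/2$ from below, so the homogenous policy is not even a local maximizer. This is a local check under the stated assumptions, not the exhaustive search used for Fig. 6, and the function names and bisection bracket are illustrative.

```python
import math

def slope_at_homogeneous(rho, lam=0.5):
    """Half of dU/dq at q = lam/2 for the exponential SINR model (M = 1)."""
    q = lam / 2.0
    tau = -rho * math.log(q)            # common threshold F^{-1}(1 - q)
    return math.log1p(tau) - (1.0 - q) * rho / ((1.0 + tau) * q)

def critical_snr_db(lam=0.5, lo=1.0, hi=10.0, iters=60):
    """Bisection for the SNR (in dB) at which the slope changes sign.

    Assumes the slope is negative at rho = lo and positive at rho = hi.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope_at_homogeneous(mid, lam) < 0.0:
            lo = mid
        else:
            hi = mid
    return 10.0 * math.log10(0.5 * (lo + hi))

print(critical_snr_db())
```

For $\lambda = 0.5$ the sign change occurs between 5.5 and 5.9 [dB], which agrees with the value read from Fig. 6.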

Another important issue to investigate is the amount of feedback reduction that can be achieved by setting thresholds optimally. In Fig. 7, we plot the ratio $\frac{R(\boldsymbol{\tau}^\star)}{R(\mathbf{0})}$ between the rates achieved with and without thresholding as a function of $\lambda$ for different numbers of MUs. In this figure, we set $M$ to 1, and $\rho$ to 1. Again, similar observations continue to hold for other values of $M$ and $\rho$. Since $\rho = 1$, the homogenous threshold feedback policy with thresholds set as $\boldsymbol{\tau}^\star = \left(\rho \log\left(\frac{n}{\lambda}\right), \ldots, \rho \log\left(\frac{n}{\lambda}\right)\right)$ is optimum, see Lemma 12 and Theorem 7. After inspecting the figure, we see that there is almost no rate loss if the average number of MUs feeding back per beam is around five. We call this critical feedback level $\lambda_c$, which is an important design parameter to be input to the higher MAC layer. It is interesting to see that the same design parameter applies to all $\frac{R(\boldsymbol{\tau}^\star)}{R(\mathbf{0})}$ curves, which shift to the right only slightly and converge pointwise to a limiting curve as the number of MUs in the system increases.

Fig. 7. The ratio between the rates achieved with and without thresholding as a function of the average number of users feeding back per beam for different numbers of MUs.
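The homogenous threshold used above follows from Lemma 12 with $M = 1$: each MU should feed back with probability $\lambda / n$, so the common threshold is $F^{-1}(1 - \lambda/n) = \rho \log(n / \lambda)$. A minimal Python sketch of this computation is given below; the function names are illustrative assumptions.

```python
import math

def homogeneous_threshold(n, lam, rho):
    """Common threshold rho * log(n / lam) for the single-beam Rayleigh case."""
    if not 0.0 < lam <= n:
        raise ValueError("need 0 < lam <= n")
    return rho * math.log(n / lam)

def feedback_probability(tau, rho):
    """P(SINR > tau) = exp(-tau / rho) for the exponential SINR model."""
    return math.exp(-tau / rho)

tau = homogeneous_threshold(n=150, lam=5.0, rho=1.0)
# The average number of MUs feeding back is n times the per-MU probability.
print(tau, 150 * feedback_probability(tau, 1.0))
```

For $n = 2$, $\lambda = 0.5$ and $\rho = 1$, this gives the threshold $\log(4)$ used in the first numerical example.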



The reason behind this phenomenon can be explained as follows. No feedback outage event occurs and beams are always assigned to the best MUs at each fading state when thresholds are set to zero. On the other hand, the feedback outage event probability is strictly positive when thresholds are optimally set to meet the feedback constraint $\lambda$. However, the tails of the distribution of the random number of MUs requesting each beam decay to zero exponentially fast, and therefore we are almost always guaranteed to have at least one MU demanding each beam whenever $\lambda$ is above the critical feedback level $\lambda_c$. As a result, the feedback outage event probability becomes negligible, and the beams are still assigned to the best MUs with very high probability whenever $\lambda > \lambda_c$. Moreover, the distribution of the random number of MUs feeding back converges to a limiting distribution linearly with the total number of MUs in the system, which results in the observed pointwise convergence behavior in Fig. 7. Further details about the limiting $\frac{R(\boldsymbol{\tau}^\star)}{R(\mathbf{0})}$ curve (as $n \to \infty$) can be found in [18], where its exact characterization was obtained and interpreted as the feedback-capacity tradeoff curve.
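To give a feel for the numbers involved, the Python sketch below computes the feedback outage probability in a simplified setting: a single beam, and $n$ MUs that each feed back independently with probability $\lambda / n$ (the homogenous policy). Under these assumptions, which are made here only for illustration, the outage probability is $(1 - \lambda/n)^n$, which approaches $e^{-\lambda}$ as $n$ grows. The exact characterization of the limiting behavior is the one in [18], not this sketch.

```python
import math

def feedback_outage_probability(n, lam):
    """P(no MU feeds back) when n MUs feed back independently w.p. lam / n."""
    if not 0.0 <= lam <= n:
        raise ValueError("need 0 <= lam <= n")
    return (1.0 - lam / n) ** n

for n in (150, 300):
    print(n, feedback_outage_probability(n, 5.0), math.exp(-5.0))
```

With $\lambda = 5$, the outage probability is already below one percent for both values of $n$, in line with the observation that beams are assigned to the best MUs with very high probability once $\lambda$ exceeds $\lambda_c$.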

Two possible interpretations about $\lambda_c$ are as follows. Since the BS communicates only with the best MU on each beam, an ideal feedback policy in terms of the optimal usage of uplink communication resources is the one that only allows the best MU to feed back at each channel fading state. However, such a policy requires centralized operation, or coordination among MUs. Thus, when compared with the ideal feedback policy, $\lambda_c$ can be interpreted as the price that we have to pay to achieve almost the same performance as the ideal feedback policy due to decentralized operation. Secondly, when compared with the all-feedback policy, it represents the amount of feedback reduction that can be achieved without any noticeable performance degradation. For example, as opposed to allowing all MUs to feed back, we can reduce the total feedback load 30 times and 60 times by setting thresholds optimally when $n = 150$ and $n = 300$, respectively, without any evident performance loss.

B. Rician and Nakagami Fading Channels 

In this part, we extend our analysis above to other channel models by briefly studying optimality and sub-optimality regions for homogenous threshold feedback policies for Nakagami and Rician fading channel models. We set $M$ to 1 for simplicity. Otherwise, calculations for the $M > 1$ case easily get very complicated for these channel models, which hinders the intuitive understanding of the results below. In particular, the derivation of the SINR distribution in the general case becomes very complex.

We start our discussion with Nakagami fading channels. In this case, $h_{k,i}$, $k = 1, \ldots, N_t$ and $i = 1, \ldots, n$, are i.i.d. with the common distribution Nakagami$(\mu, \omega)$, where $\mu$ and $\omega$ are shape and spread parameters, respectively. Hence, channel power gains are Gamma distributed with distribution Gamma$\left(\mu, \frac{\omega}{\mu}\right)$, where $\mu$ and $\frac{\omega}{\mu}$ are shape and scale parameters of the associated Gamma distribution, respectively. We first note that $\omega$ is equal to the average channel power gain, and therefore it is set to 1 to be consistent with the Rayleigh fading channel model above. Secondly, if $X$ is a random variable with distribution Gamma$\left(\mu, \frac{1}{\mu}\right)$, then $aX$ is distributed according to Gamma$\left(\mu, \frac{a}{\mu}\right)$, where $a$ is a positive real number. Therefore, under this channel model, the SINR^4 distribution is equal to Gamma$\left(\mu, \frac{\rho}{\mu}\right)$.

In Fig. 8, we illustrate the regions on which homogenous threshold feedback policies are optimal and suboptimal for the Nakagami fading channel model. We set $n$ to 2 and $\lambda$ to 0.5 in this figure. The same observations continue to hold for other parameter selections. The blue region is computed numerically by using the sufficient condition for the Schur-concavity of the sum-rate in Theorem 5, whereas the red region is obtained by evaluating the sufficient condition in Theorem 6 numerically. As mentioned earlier, the sufficient condition in Theorem 5 is less restrictive than the one in Theorem 6, which is why the red region is contained within the blue region in Fig. 8. Note that the Nakagami fading model reduces to the Rayleigh fading model, and the red region only covers SNR values less than one when $\mu = 1$, which is in accordance with our discussion and Theorem 7 above. Surprisingly, our numerical investigation shows that homogenous threshold feedback policies are suboptimal outside the blue region in Fig. 8. Therefore, we conjecture that the condition provided in Theorem 5 is also necessary for the optimality of homogenous threshold feedback policies.

^4 No inter-beam interference exists in the $M = 1$ case. Hence, the random SINR is the same quantity as the random SNR. We continue to use the term SINR for this case to avoid any confusion with the average SNR $\rho$.

Fig. 8. The regions on which homogenous threshold feedback policies are optimal and suboptimal for the Nakagami fading channel model. (Horizontal axis: shape parameter $\mu$. Labeled regions: the region on which homogenous threshold feedback policies are suboptimal; the region on which the sufficient condition in Theorem 6 is satisfied; the region on which the sufficient condition in Theorem 5 is satisfied.)

Secondly, we consider the Rician fading channel model in which the channel amplitude gains are Rician distributed with distribution Rician$(K, P)$, where $P$ is the total power gain and $K$ (a.k.a., $K$ factor) is the ratio between the power in the direct path and the power in the scattered paths. We set $P$ to 1 to be consistent with the Rayleigh and Nakagami fading channel models studied above. If $X$ is a random variable with distribution Rician$(K, P)$, then $\left(\frac{X}{\sigma}\right)^2$ has a non-central Chi-square distribution with two degrees of freedom, and the non-centrality parameter is given by $2K$ if the scaling coefficient $\sigma$ is chosen to be $\sigma = \sqrt{\frac{P}{2(1+K)}}$. We obtain the SINR distribution by scaling this non-central Chi-square distribution with $\rho \sigma^2$.

Fig. 9 illustrates the regions on which homogenous threshold feedback policies are optimal and suboptimal for the Rician fading channel model. We set $n$ to 2 and $\lambda$ to 0.5 in this figure. Since explanations similar to those above continue to hold for the Rician fading channel model as well, we do not repeat them here.

Fig. 9. The regions on which homogenous threshold feedback policies are optimal and suboptimal for the Rician fading channel model. (Labeled regions: the region on which homogenous threshold feedback policies are suboptimal; the region on which the sufficient condition in Theorem 6 is satisfied; the region on which the sufficient condition in Theorem 5 is satisfied.)
C. Why Does Sub-optimality Arise? 

In this part, we provide an intuitive explanation for why homogenous threshold feedback policies 
sometimes become suboptimal to use even when MUs experience statistically the same channel conditions. 
Our discussion will focus on the single beam case first. 

Let $\beta$ be the feedback outage event probability, $R(\boldsymbol{\tau}_{\mathrm{homo}})$ be the sum-rate achieved by the homogenous threshold feedback policy satisfying feedback constraints with equality, and $R(\boldsymbol{\tau}^\star)$ be the sum-rate achieved by setting thresholds optimally. For simplicity, we let $n = 2$, but similar explanations continue to hold for any $n$. The sum-rate in this case can be written as

$$R(\tau_1, \tau_2) = (1 - \beta)\, \mathrm{E}\left[ \log\left( 1 + \max_{i = 1, 2} \gamma_i 1_{\{\gamma_i \ge \tau_i\}} \right) \Big|\ \text{No Outage} \right].$$

Two key underlying factors affect this rate expression. The first one is the power gain that can be achieved by means of multiuser diversity. This is represented by the maximization operation inside the logarithm function above. The more MUs feed back, the more likely it is that the output of this maximization operation is higher. Indeed, the exact asymptotic statistics of the resulting power gain (under various channel models) can be obtained by resorting to an order statistics analysis [36]. The second factor is the degrees-of-freedom gain represented by the $1 - \beta$ term. The smaller the feedback outage event probability, the higher the degrees-of-freedom gain that we achieve. The choice of thresholds affects both gains, and the interplay between them determines how we set thresholds to maximize the downlink sum-rate.

In Fig. 10, we focus on the Rayleigh fading channel model to provide further details about the interplay between power and degrees-of-freedom gains. In this figure, we set $\lambda$ to 0.5, and plot the optimal feedback probability $p_2^\star$ of the second MU as a function of $\rho$. In the low SNR regime, $p_2^\star$ is equal to 0.25, which implies the optimality of the homogenous threshold feedback policy equalizing the feedback probabilities of both MUs. However, as $\rho$ increases, we start to prefer one MU over the other one to maximize the sum-rate. In this case, for example, we prefer the first MU over the second one by decreasing the feedback probability of the second MU to zero, and increasing the feedback probability of the first MU to 0.5 in the high SNR regime.

Fig. 10. Optimal feedback probability $p_2^\star$ of the second MU as a function of $\rho$. ($\lambda = 0.5$)

The main reason behind this behavior is as follows. When the SNR is low, the sum-rate increases 
almost linearly with the power gain. As a result, we tend to choose thresholds equally to maximize the 
power gain, and thereby to maximize the sum-rate, in the low SNR regime although such a threshold 
assignment reduces the degrees-of-freedom gain. In the high SNR regime, on the other hand, the power 
gain can only provide a logarithmic increase in the sum-rate, i.e., the law of diminishing returns. Hence, 
the power gain earned by setting thresholds equally becomes negligible when compared to the loss in 
the degrees-of-freedom gain, and we tend to choose thresholds heterogeneously to maximize the degrees- 
of-freedom gain, and thereby to maximize the sum-rate, in the high SNR regime. A similar behavior 
continues to hold for other channel models, which is what we investigate next. 

In Figs. 11(a) and 11(b), we plot the ratio $\frac{R(\boldsymbol{\tau}_{\mathrm{homo}})}{R(\boldsymbol{\tau}^\star)}$ as a function of $\rho$ and $K$, respectively, for the Rician fading channel model. We set $\lambda$ to 1 in both figures. The SNR has the same effect on how we set thresholds optimally in the Rician case as well. When the SNR is small, we prefer the power gain over the degrees-of-freedom gain, and set thresholds equally to maximize the sum-rate, which is why the $\frac{R(\boldsymbol{\tau}_{\mathrm{homo}})}{R(\boldsymbol{\tau}^\star)}$ ratio is around one for small values of $\rho$, and for $K = 0$ and 2. When the SNR is high, we prefer the degrees-of-freedom gain over the power gain, and set thresholds unequally to maximize the sum-rate, which is why the $\frac{R(\boldsymbol{\tau}_{\mathrm{homo}})}{R(\boldsymbol{\tau}^\star)}$ ratio converges to 0.75 for high values of $\rho$.

The exact behavior of $\frac{R(\boldsymbol{\tau}_{\mathrm{homo}})}{R(\boldsymbol{\tau}^\star)}$ strongly depends on $K$, too. Roughly speaking, $K$ determines the dynamic range of the SINR distribution, and the power gain due to multiuser diversity becomes more prominent when the dynamic range of the distribution is large [4]. However, as $K$ increases, the power in the direct path increases, which, in turn, nullifies the scattering effects and reduces the dynamic range of the SINR distribution, e.g., see Fig. 12 for an illustration. Therefore, regardless of how small the SNR is, it may still become suboptimal to use homogenous threshold feedback policies when $K$ is large, as illustrated by the curves corresponding to $K = 10$ and 50 in Fig. 11(a). Furthermore, as $K$ increases, the channel becomes more deterministic, and we experience almost no power gain due to multiuser diversity in the limit. As a result, $\frac{R(\boldsymbol{\tau}_{\mathrm{homo}})}{R(\boldsymbol{\tau}^\star)}$ still converges to 0.75 as $K$ grows large, which is illustrated by Fig. 11(b).

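Finally, we note that the limiting value of $\frac{R(\boldsymbol{\tau}_{\mathrm{homo}})}{R(\boldsymbol{\tau}^\star)}$ (in the high SNR, or high $K$ regime) depends on the feedback constraint $\lambda$. If $\lambda \le 1$, the optimum feedback probability selection converges to $p_1 = \lambda$ and $p_2 = 0$ (or, vice versa) when $\rho$ or $K$ grows large. Hence, $\frac{R(\boldsymbol{\tau}_{\mathrm{homo}})}{R(\boldsymbol{\tau}^\star)}$ converges to $1 - \frac{\lambda}{4}$ for $\lambda \le 1$, which is in line with the 0.75 limit to which the curves in both Figs. 11(a) and 11(b) converge. If $\lambda > 1$, the optimum feedback probability selection converges to $p_1 = 1$ and $p_2 = \lambda - 1$ (or, vice versa) when $\rho$ or $K$ grows large. Hence, $\frac{R(\boldsymbol{\tau}_{\mathrm{homo}})}{R(\boldsymbol{\tau}^\star)}$ converges to $\lambda - \frac{\lambda^2}{4}$ for $\lambda > 1$. Let $C^\star(\lambda)$, $0 < \lambda \le 2$, be the limiting value to which $\frac{R(\boldsymbol{\tau}_{\mathrm{homo}})}{R(\boldsymbol{\tau}^\star)}$ converges as $\rho$ or $K$ grows large. It is not hard to see that the minimum value of $C^\star(\lambda)$ is 0.75, which is achieved when $\lambda = 1$. Therefore, the maximum optimality loss arising from the use of homogenous threshold feedback policies for a two-user single beam system is 25%.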
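The limiting values above can be reproduced with the short Python sketch below. It rests on one assumption stated here explicitly: in the limit, the ratio of the two sum-rates equals the ratio of the corresponding no-outage probabilities $1 - \beta = 1 - (1 - p_1)(1 - p_2)$, i.e., only the degrees-of-freedom gain matters. The function name and the grid are illustrative.

```python
def limiting_ratio(lam):
    """Limiting ratio C*(lam) for two users and a single beam, 0 < lam <= 2."""
    if not 0.0 < lam <= 2.0:
        raise ValueError("need 0 < lam <= 2")
    p = lam / 2.0                                   # homogenous policy
    no_outage_homo = 1.0 - (1.0 - p) ** 2
    p1 = min(lam, 1.0)                              # limiting optimal policy
    p2 = lam - p1
    no_outage_opt = 1.0 - (1.0 - p1) * (1.0 - p2)
    return no_outage_homo / no_outage_opt

grid = [k / 100.0 for k in range(1, 201)]
worst = min(grid, key=limiting_ratio)
print(worst, limiting_ratio(worst))
```

The two branches simplify to $1 - \frac{\lambda}{4}$ for $\lambda \le 1$ and $\lambda - \frac{\lambda^2}{4}$ for $\lambda > 1$, and the grid minimum is attained at $\lambda = 1$.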

Up to now, we have only focused on the single beam case to explain why homogenous threshold feedback policies may sometimes become suboptimal to use. Based on the arguments above, we provide further insights as to why homogenous threshold feedback policies are always optimal to use when $M > 1$ for the Rayleigh fading channel model. We first note that, in contrast to the single beam case, Theorem 8 indicates a potential phase transition phenomenon in the behavior of the sum-rate in which homogenous threshold feedback policies suddenly become always optimal to use when we go from the single beam case to the multiple beams case. The main reason behind this phenomenon is the inter-beam interference when multiple beams are used to communicate with multiple MUs simultaneously. Such a multiuser operation makes the network interference limited, rather than being noise limited, when compared to the single beam case. More specifically, an increase in SNR implies a corresponding increase in the inter-beam interference experienced by other beams, and the system ends up operating always in the low SNR regime effectively when $M > 1$. Therefore, the low SNR Rayleigh fading behavior kicks in, and homogenous threshold feedback policies always become optimal to use. On the other hand, received signal powers improve linearly with SNR in the single beam case, which makes homogenous threshold feedback policies suboptimal to use in the high SNR regime.

Fig. 11. The change of the ratio between the sum-rates achieved by homogenous and optimal threshold feedback policies as a function of $\rho$ and $K$. ($\lambda = 1$)

Fig. 12. Dynamic range of the SINR distribution for the Rician fading channel model for different values of $K$. ($\rho = 1$)

Although this intuition works for the Rayleigh fading channel model, it is too optimistic to ask for the optimality of homogenous threshold feedback policies for other channel models as well when $M > 1$. As our discussion above makes apparent, the power gain due to multiuser diversity strongly depends on the parameters of the fading process determining the dynamic range of the resulting SINR distribution. There is no power gain to benefit from multiuser diversity by giving all MUs equal chances of channel access if the SINR distribution becomes increasingly more deterministic. In these instances, it is expected that a heterogenous threshold assignment will maximize the sum-rate even if the network is interference limited due to multi-beam operation. It is a potential future research interest to investigate the conditions on the parameters of the fading process to guarantee the optimality of homogenous threshold feedback policies for channel models other than the Rayleigh fading model such as Rician and Nakagami fading channels.

VII. Conclusions 

Opportunistic beamforming is an important communication strategy achieving the full CSI sum-rate 
capacity for vector broadcast channels to a first order by only requiring partial CSI at the BS. Nevertheless, 
it cannot eliminate the linear growth in the feedback load with increasing numbers of MUs in the 
network unless a selective feedback policy is implemented for user selection. In this paper, we have been 
motivated by these considerations to analyze the resulting downlink sum-rate with user selection when 
orthonormal beams are opportunistically allocated to MUs for the downlink communication. In particular, 
we have focused on the structure of optimal selective decentralized feedback policies for opportunistic 
beamforming under finite feedback constraints on the average number of MUs feeding back. The main 
findings are twofold. 

We have shown that threshold feedback policies in which MUs compare their beam SINRs with a 
threshold for their feedback decisions are always optimal to maximize the downlink sum-rate. This class 
of policies was studied in many previous works such as [13]-[18] without any formal justification for 
why they are the right choice for user selection. Our thresholding optimality result provides the formal 
justification, which holds for all fading channel models with continuous distribution functions. 

Having established the optimality of threshold feedback policies, we now face an optimal threshold selection problem to maximize the sum-rate. This is a non-convex optimization problem over finite dimensional Euclidean spaces. We solve this problem by identifying an underlying Schur-concave structure in the sum-rate when it is viewed as a function of feedback probabilities. Specifically, we have obtained sufficient conditions ensuring the Schur-concavity of the sum-rate, and therefore the rate optimality of homogenous threshold feedback policies in which all MUs use the same threshold for their feedback decisions. These sufficient conditions have been provided for general fading channel models as well.

Finally, we have performed an extensive numerical and simulation study to illustrate the applications 
of our results to familiar fading channel models such as Rayleigh, Nakagami and Rician fading channels.
With some surprise, we have shown that homogenous threshold feedback policies are not always
optimal to use for general fading channels, even when all MUs experience statistically the same channel 
conditions. In the particular case of Rayleigh fading channels, on the other hand, homogenous threshold 
feedback policies have been proven to be rate-wise optimal if multiple beams are used for the downlink 
communication. We have also studied the optimality and sub-optimality regions for homogenous
threshold feedback policies in the Rician and Nakagami cases. Detailed insights into when and
why homogenous threshold feedback policies are rate-wise optimal or suboptimal have been provided,
in conjunction with various other design and engineering perspectives.

Appendix A 
Loss Event and Gain Event 

A. Proof of Lemma 1 

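Set $\hat{A}_L = \{\Gamma \in \mathbb{R}_+^{M \times n} : \boldsymbol{\gamma}_1 \in \mathcal{S}_L \ \&\ \gamma_1^\star < \gamma_{1,1}\}$.
We will show $A_L = \hat{A}_L$. For all $\Gamma$ with $\boldsymbol{\gamma}_1 \in \mathcal{S}_L$,
MU 1 requests beam 1 under $\mathcal{F}$, but not under $\mathcal{F}^1$. Therefore, if
$\boldsymbol{\gamma}_1 \in \mathcal{S}_L$ and $\gamma_1^\star < \gamma_{1,1}$, the system
using $\mathcal{F}$ schedules MU 1 for communication along beam 1, and the system using $\mathcal{F}^1$ schedules another
MU having $\gamma_1^\star < \gamma_{1,1}$ for communication along beam 1. This means
$r^1(\mathcal{F}^1, \Gamma) < r^1(\mathcal{F}, \Gamma)$ for all $\Gamma \in \hat{A}_L$, implying $\hat{A}_L \subseteq A_L$.

Showing $A_L \subseteq \hat{A}_L$ will complete the proof. For all $\Gamma$ with $\gamma_1^\star > \gamma_{1,1}$, both feedback policies will
achieve the same throughput by scheduling the MU having $\gamma_1^\star$. Therefore, we must have $\gamma_1^\star < \gamma_{1,1}$ on
the loss event. Now, if $\boldsymbol{\gamma}_1 \notin \mathcal{FB}_1$, MU 1 will not feed back under $\mathcal{F}$, which implies no potential loss
on beam 1. Therefore, for all $\Gamma \in A_L$, we must have $\boldsymbol{\gamma}_1 \in \mathcal{FB}_1$ and $\gamma_1^\star < \gamma_{1,1}$.
If $\gamma_1^\star < \gamma_{1,1}$ and $\boldsymbol{\gamma}_1 \in \mathcal{FB}_1$ is such that MU 1 also requests beam 1 under $\mathcal{F}^1$,
both feedback policies schedule MU 1 on beam 1, resulting in a neutral event. This implies
that $\boldsymbol{\gamma}_1 \in \mathcal{S}_L$ and $\gamma_1^\star < \gamma_{1,1}$ for all $\Gamma \in A_L$. Therefore, we also have
$A_L \subseteq \hat{A}_L$, which concludes the proof.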

B. Proof of Lemma 2 

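The proof is similar to the one given for Lemma 1. Set
$\hat{A}_G = \{\Gamma \in \mathbb{R}_+^{M \times n} : \boldsymbol{\gamma}_1 \in \mathcal{S}_G \ \&\ \gamma_1^\star < \gamma_{1,1}\}$.
We first show that $\hat{A}_G \subseteq A_G$. For all $\Gamma$ with $\boldsymbol{\gamma}_1 \in \mathcal{S}_G$ and
$\gamma_1^\star < \gamma_{1,1}$, a system using $\mathcal{F}^1$ schedules MU
1 for communication on beam 1, but a system using $\mathcal{F}$ schedules the MU with $\gamma_1^\star < \gamma_{1,1}$. Therefore,
$r^1(\mathcal{F}^1, \Gamma) > r^1(\mathcal{F}, \Gamma)$ if $\boldsymbol{\gamma}_1 \in \mathcal{S}_G$ and $\gamma_1^\star < \gamma_{1,1}$,
implying $\hat{A}_G \subseteq A_G$.

Next, observe that the neutral event occurs for all $\Gamma$ with $\gamma_1^\star > \gamma_{1,1}$. Therefore, we must have
$\gamma_1^\star < \gamma_{1,1}$ on the gain event. If $\gamma_{1,1} < \tau_1$, MU 1 will not feed back under $\mathcal{F}^1$,
and therefore no rate gain is achieved by switching to $\mathcal{F}^1$. Therefore, we must have $\gamma_{1,1} > \tau_1$ on the gain event.
If, in addition, MU 1 also requests beam 1 under $\mathcal{F}$, it feeds
back under both feedback policies, which again leads to a neutral event. Therefore, for all $\Gamma \in A_G$, we
must have $\boldsymbol{\gamma}_1 \in \mathcal{S}_G$ and $\gamma_1^\star < \gamma_{1,1}$, which shows that
$A_G \subseteq \hat{A}_G$ and completes the proof.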

Appendix B 
Proof of Lemma 3 

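$\mathcal{M}_m$ and $\hat{\mathcal{M}}_m$ are given as

$$\mathcal{M}_m = \left\{ i \in \mathcal{M} : b_i^\star = m \ \&\ \gamma_{b_i^\star, i} > \tau_i \right\}$$

and

$$\hat{\mathcal{M}}_m = \left\{ i \in \mathcal{M} : \gamma_{m,i} > \tau_i \right\}.$$

Thus, we have $\mathcal{M}_m \subseteq \hat{\mathcal{M}}_m$. To show the other direction, take any $i \in \hat{\mathcal{M}}_m$, and a beam index $r \neq m$.
Then, $|\boldsymbol{h}_i^\top \boldsymbol{q}_m|^2 > |\boldsymbol{h}_i^\top \boldsymbol{q}_r|^2$ because $\tau_i > 1$. Therefore, the following holds.

$$\gamma_{m,i} = \frac{|\boldsymbol{h}_i^\top \boldsymbol{q}_m|^2}{\frac{1}{\rho} + \sum_{k=1, k \neq m}^{M} |\boldsymbol{h}_i^\top \boldsymbol{q}_k|^2} > \gamma_{r,i}.$$

As a result, any MU $i \in \hat{\mathcal{M}}_m$ achieves its maximum SINR at beam $m$ if $\tau_i > 1$. This implies that $b_i^\star = m$
and $i \in \mathcal{M}_m$.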
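
The argument above can be checked numerically with a short sketch. The snippet is illustrative only: it draws arbitrary non-negative beam gains (standing in for the squared projections of the channel onto the beams), forms the SINR on each beam, and counts the cases in which a beam with SINR above 1 fails to be the best beam. The function names, the value of `rho`, the number of beams and the exponential gain model are assumptions made for this example and do not come from the paper.

```python
import random

def beam_sinrs(gains, rho):
    """SINR on each beam: gain_m / (1/rho + sum of the gains on the other beams)."""
    total = sum(gains)
    return [g / (1.0 / rho + total - g) for g in gains]

def count_violations(num_beams=4, rho=10.0, trials=20000, seed=0):
    """Count beams whose SINR exceeds 1 without being the maximum SINR of that user."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        # Hypothetical gain model: i.i.d. unit-mean exponential gains.
        gains = [rng.expovariate(1.0) for _ in range(num_beams)]
        sinrs = beam_sinrs(gains, rho)
        best = max(sinrs)
        violations += sum(1 for s in sinrs if s > 1.0 and s < best)
    return violations

violations = count_violations()
print(violations)
```

Under these assumptions the count should be zero, which is the numerical counterpart of the statement that a threshold larger than 1 can only be exceeded on the best beam.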

Appendix C 
Proof of Lemma 5 

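Assume $\tau_2 > \tau_1$ (i.e., $\tau_1 = \tau_{\pi(1)}$ and $\tau_2 = \tau_{\pi(2)}$) for notational simplicity. Then, for a two-user system,
the rate on beam 1 as a function of the thresholds is given as

$$\begin{aligned}
R^1(\tau_1, \tau_2) = {} & F(\tau_1) \int_{\tau_2}^{\infty} \log(1+x)\, dF(x) + F(\tau_2) \int_{\tau_1}^{\infty} \log(1+x)\, dF(x) \\
& + \mathrm{E}\left[ \log\left(1 + \max\{\gamma_{1,1}, \gamma_{1,2}\}\right) 1_{\{\gamma_{1,1} > \tau_1,\, \gamma_{1,2} > \tau_2\}} \right] \\
= {} & F(\tau_1) \int_{\tau_2}^{\infty} \log(1+x)\, dF(x) + F(\tau_2) \int_{\tau_1}^{\infty} \log(1+x)\, dF(x) \\
& + (1 - F(\tau_1))(1 - F(\tau_2))\, \mathrm{E}\left[ \log\left(1 + \max\{\gamma_{1,1}, \gamma_{1,2}\}\right) \mid \gamma_{1,1} > \tau_1, \gamma_{1,2} > \tau_2 \right].
\end{aligned}$$

Let $H(x) = \Pr\{\max\{\gamma_{1,1}, \gamma_{1,2}\} \leq x \mid \gamma_{1,1} > \tau_1, \gamma_{1,2} > \tau_2\}$, i.e., $H(x)$ is the CDF of $\max\{\gamma_{1,1}, \gamma_{1,2}\}$
given $\gamma_{1,1} > \tau_1$ and $\gamma_{1,2} > \tau_2$. Then,

$$H(x) = \begin{cases} \dfrac{F(x) - F(\tau_1)}{1 - F(\tau_1)} \cdot \dfrac{F(x) - F(\tau_2)}{1 - F(\tau_2)} & \text{if } x \geq \max\{\tau_1, \tau_2\} \\ 0 & \text{if } x < \max\{\tau_1, \tau_2\}. \end{cases} \qquad (28)$$

We can write $R^1(\tau_1, \tau_2)$ as

$$\begin{aligned}
R^1(\tau_1, \tau_2) = {} & F(\tau_1) \int_{\tau_2}^{\infty} \log(1+x)\, dF(x) + F(\tau_2) \int_{\tau_1}^{\infty} \log(1+x)\, dF(x) \\
& + (1 - F(\tau_1))(1 - F(\tau_2)) \int_{\tau_2}^{\infty} \log(1+x)\, dH(x), \qquad (29)
\end{aligned}$$

and substituting (28) in (29) leads to

$$R^1(\tau_1, \tau_2) = \int_{\tau_2}^{\infty} \log(1+x)\, dF^2(x) + F(\tau_2) \int_{\tau_1}^{\tau_2} \log(1+x)\, dF(x), \qquad (30)$$

for $\tau_2 > \tau_1$. For $\tau_1 > \tau_2$, we just switch the places of $\tau_1$ and $\tau_2$ in (30). Hence, the proof is complete.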
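
As a sanity check on (30), the two-user rate can be compared against a direct simulation of the feedback rule. The sketch below is only illustrative: it assumes a unit-mean exponential SINR distribution, natural logarithms, and specific threshold values, none of which are taken from the paper, and all function names are hypothetical. The closed form is evaluated with a simple midpoint rule, using $dF^2(x) = 2F(x)f(x)\,dx$.

```python
import math
import random

def cdf(x):
    """Assumed SINR CDF for illustration only: unit-mean exponential."""
    return 1.0 - math.exp(-x)

def pdf(x):
    """Density matching the assumed CDF."""
    return math.exp(-x)

def integrate(func, lower, upper, steps=200000):
    """Midpoint-rule quadrature of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))

def rate_closed_form(tau1, tau2, upper=60.0):
    """Two-user rate on one beam from (30); thresholds are sorted so lo <= hi."""
    lo, hi = sorted((tau1, tau2))
    # Integral of log(1+x) against d F^2(x) = 2 F(x) f(x) dx, truncated at `upper`.
    tail = integrate(lambda x: math.log(1 + x) * 2 * cdf(x) * pdf(x), hi, upper)
    gap = integrate(lambda x: math.log(1 + x) * pdf(x), lo, hi) if hi > lo else 0.0
    return tail + cdf(hi) * gap

def rate_monte_carlo(tau1, tau2, samples=200000, seed=1):
    """Each user reports only above its threshold; the best reporter is served."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        g1, g2 = rng.expovariate(1.0), rng.expovariate(1.0)
        reported = [g for g, t in ((g1, tau1), (g2, tau2)) if g > t]
        if reported:
            total += math.log(1 + max(reported))
    return total / samples

closed = rate_closed_form(0.5, 1.5)
simulated = rate_monte_carlo(0.5, 1.5)
print(closed, simulated)
```

The two printed values should agree up to Monte Carlo noise. Sorting the thresholds inside `rate_closed_form` mirrors the remark that the roles of the two thresholds are simply switched when their order is reversed.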

Appendix D 
Rate for Different Values of $\gamma_{N'}$

A. Proof of Lemma 6 

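Let $\xi_{i+1,i} = \max\{\gamma_{1,\pi(i+1)}, \gamma_{1,\pi(i)}\}$. From (21),

$$R^1\left(\tau_{\pi(i+1)}, \tau_{\pi(i)} \mid \gamma_{N'}\right) = \log\left(1 + \gamma_{N'}\right) \Pr\left\{\xi^\star_{i+1,i} \leq \gamma_{N'} \mid \gamma_{N'}\right\} + \mathrm{E}\left[ \log\left(1 + \xi^\star_{i+1,i}\right) 1_{\{\xi^\star_{i+1,i} > \gamma_{N'}\}} \mid \gamma_{N'} \right]. \qquad (31)$$

Let $A = \{\xi^\star_{i+1,i} \leq \gamma_{N'}\}$ and $B = \{\xi_{i+1,i} \leq \gamma_{N'}\}$. Since $\gamma_{N'}$ is larger than $\tau_{\pi(i+1)}$, it follows that
$A = B$. Thus, we can write $\Pr\{\xi^\star_{i+1,i} \leq \gamma_{N'} \mid \gamma_{N'}\} = \Pr\{\xi_{i+1,i} \leq \gamma_{N'} \mid \gamma_{N'}\}$ for the first term on the
right-hand side of (31). For the second term, we have $\xi_{i+1,i} = \xi^\star_{i+1,i}$ since $\xi^\star_{i+1,i} > \gamma_{N'} > \tau_{\pi(i+1)}$, which
concludes the proof.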

B. Proof of Lemma 7 

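When $\gamma_{N'} < \tau_{\pi(i)}$, (31) simplifies to

$$\begin{aligned}
R^1\left(\tau_{\pi(i+1)}, \tau_{\pi(i)} \mid \gamma_{N'}\right) = {} & \mathrm{E}\left[ \log\left(1 + \xi_{i+1,i}\right) 1_{\{\gamma_{1,\pi(i)} > \tau_{\pi(i)},\, \gamma_{1,\pi(i+1)} > \tau_{\pi(i+1)}\}} \right] \\
& + \mathrm{E}\left[ \log\left(1 + \gamma_{1,\pi(i+1)}\right) 1_{\{\gamma_{1,\pi(i+1)} > \tau_{\pi(i+1)}\}} \right] F\left(\tau_{\pi(i)}\right) \\
& + \mathrm{E}\left[ \log\left(1 + \gamma_{1,\pi(i)}\right) 1_{\{\gamma_{1,\pi(i)} > \tau_{\pi(i)}\}} \right] F\left(\tau_{\pi(i+1)}\right) \\
& + \log\left(1 + \gamma_{N'}\right) F\left(\tau_{\pi(i)}\right) F\left(\tau_{\pi(i+1)}\right).
\end{aligned}$$

The first three terms on the right-hand side are identical to the rate expression for the two-user system
in Lemma 5. Substituting the result for the two-user case completes the proof.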
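C. Proof of Lemma 8

For $\tau_{\pi(i)} < \gamma_{N'} < \tau_{\pi(i+1)}$, (31) simplifies to

$$\begin{aligned}
R^1\left(\tau_{\pi(i+1)}, \tau_{\pi(i)} \mid \gamma_{N'}\right) = {} & \log\left(1 + \gamma_{N'}\right) F\left(\tau_{\pi(i+1)}\right) F\left(\gamma_{N'}\right) \\
& + \mathrm{E}\left[ \log\left(1 + \gamma_{1,\pi(i)}\right) 1_{\{\gamma_{1,\pi(i)} > \gamma_{N'}\}} \mid \gamma_{N'} \right] F\left(\tau_{\pi(i+1)}\right) \\
& + \mathrm{E}\left[ \log\left(1 + \gamma_{1,\pi(i+1)}\right) 1_{\{\gamma_{1,\pi(i+1)} > \tau_{\pi(i+1)}\}} \right] F\left(\gamma_{N'}\right) \\
& + \mathrm{E}\left[ \log\left(1 + \xi^\star_{i+1,i}\right) 1_{\{\gamma_{1,\pi(i)} > \gamma_{N'},\, \gamma_{1,\pi(i+1)} > \tau_{\pi(i+1)}\}} \mid \gamma_{N'} \right].
\end{aligned}$$

The last three terms on the right-hand side can be further simplified as in Lemma 5 for the two-user
system, which completes the proof.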



Appendix E 
Proof of Lemma 12 

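For $M = 1$, it is easy to get $F^{-1}(x) = -\rho \log(1 - x)$. For $M > 1$, we need to find the function
$F^{-1}(x)$ satisfying

$$F\left(F^{-1}(x)\right) = 1 - \frac{\exp\left(-\frac{F^{-1}(x)}{\rho}\right)}{\left(1 + F^{-1}(x)\right)^{M-1}} = x.$$

The following chain of implications holds.

$$\begin{aligned}
F\left(F^{-1}(x)\right) = x
& \iff \left(1 + F^{-1}(x)\right)^{1-M} \exp\left(-\frac{1 + F^{-1}(x)}{\rho}\right) = \exp\left(-\frac{1}{\rho}\right)(1 - x) \\
& \iff \left( \left(1 + F^{-1}(x)\right) \exp\left(\frac{1 + F^{-1}(x)}{(M-1)\rho}\right) \right)^{1-M} = \exp\left(-\frac{1}{\rho}\right)(1 - x) \\
& \iff \frac{1 + F^{-1}(x)}{(M-1)\rho} \exp\left(\frac{1 + F^{-1}(x)}{(M-1)\rho}\right) = \frac{\exp\left(\frac{1}{(M-1)\rho}\right)}{(M-1)\rho} (1 - x)^{-\frac{1}{M-1}} \\
& \iff F^{-1}(x) = -1 + (M-1)\rho\, W\left( \frac{\exp\left(\frac{1}{(M-1)\rho}\right)}{(M-1)\rho} (1 - x)^{-\frac{1}{M-1}} \right),
\end{aligned}$$

which completes the proof.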
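
The inverse obtained in this proof can be checked numerically by composing it with the CDF. The following is a minimal sketch under stated assumptions: it takes $F(x) = 1 - e^{-x/\rho}/(1+x)^{M-1}$ as the SINR distribution implied by the derivation, reads $W$ as the principal branch of the Lambert W function, and evaluates it with a plain Newton iteration written for this example. All function names and parameter values are hypothetical.

```python
import math

def lambert_w(z, tol=1e-14, max_iter=100):
    """Principal branch of the Lambert W function for z >= 0 (Newton iteration)."""
    w = math.log1p(z)  # starting point at or above the root of w * exp(w) = z
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def sinr_cdf(x, num_beams, rho):
    """Assumed CDF: F(x) = 1 - exp(-x / rho) / (1 + x)^(M - 1)."""
    return 1.0 - math.exp(-x / rho) / (1.0 + x) ** (num_beams - 1)

def sinr_inverse_cdf(p, num_beams, rho):
    """Inverse CDF for 0 <= p < 1, following the closed form derived above."""
    if num_beams == 1:
        return -rho * math.log(1.0 - p)
    c = (num_beams - 1) * rho
    z = math.exp(1.0 / c) / c * (1.0 - p) ** (-1.0 / (num_beams - 1))
    return -1.0 + c * lambert_w(z)

for p in (0.1, 0.5, 0.9):
    x = sinr_inverse_cdf(p, num_beams=4, rho=2.0)
    print(p, x, sinr_cdf(x, num_beams=4, rho=2.0))
```

Each printed row should show the CDF value returning to the probability it started from. A library routine for the Lambert W function could replace the hand-written iteration where one is available.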



Appendix F 
Proof of Theorem 8 

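By Theorem 7, it is enough to show that $f'\left(F^{-1}(x)\right) < -\frac{f\left(F^{-1}(x)\right)}{1 + F^{-1}(x)}$ for all $x \in [0, 1]$. To this end,
let

$$g(x) = 1 + \frac{\left(1 + F^{-1}(x)\right) f'\left(F^{-1}(x)\right)}{f\left(F^{-1}(x)\right)}. \qquad (32)$$

To simplify $g(x)$ further, we first put $y = F^{-1}(x)$ in (32). Then

$$g(y) = 1 + \frac{(1 + y)^{M+1}}{e^{-\frac{y}{\rho}} \left( \frac{1}{\rho}(y + 1) + M - 1 \right)} \cdot \frac{\frac{1}{\rho}(1 + y)^{M} e^{-\frac{y}{\rho}} - \frac{1}{\rho}(1 + y)^{M} e^{-\frac{y}{\rho}} \left( \frac{1}{\rho}(y + 1) + M - 1 \right) - M (1 + y)^{M-1} \left( \frac{1}{\rho}(y + 1) + M - 1 \right) e^{-\frac{y}{\rho}}}{(1 + y)^{2M}}.$$

After some further simplifications, we get

$$g(y) = 1 + \frac{\frac{1}{\rho}(y + 1)}{\frac{1}{\rho}(y + 1) + M - 1} - \frac{1}{\rho}(y + 1) - M.$$

Using Lemma 12, we can write $y$ as $y = -1 + (M-1)\rho W(x)$, where
$W(x) = W\left( \frac{\exp\left(\frac{1}{(M-1)\rho}\right)}{(M-1)\rho} (1 - x)^{-\frac{1}{M-1}} \right)$.
Hence, $g(x)$ can be given as

$$g(x) = 1 + \frac{W(x)}{W(x) + 1} - (M - 1) W(x) - M = -\frac{(M - 1) W(x)^2 + (2M - 3) W(x) + M - 1}{W(x) + 1},$$

which is always strictly negative for $M \geq 2$. This implies $f'\left(F^{-1}(x)\right) < -\frac{f\left(F^{-1}(x)\right)}{1 + F^{-1}(x)}$ for all $x \in [0, 1]$
when $M \geq 2$, which completes the proof.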
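
The simplification of (32) and the sign of the resulting expression can be spot-checked numerically. The sketch below is illustrative and rests on assumptions that are not taken verbatim from the paper: $f$ is taken to be the density of $F(y) = 1 - e^{-y/\rho}/(1+y)^{M-1}$, its derivative is approximated by a central difference, and the relation $W = (1+y)/((M-1)\rho)$ from Lemma 12 is used to evaluate the closed form directly in terms of $y$. Function names and parameter values are hypothetical.

```python
import math

def pdf(y, num_beams, rho):
    """Assumed density f(y) = F'(y) for F(y) = 1 - exp(-y/rho) / (1 + y)^(M-1)."""
    a = (y + 1.0) / rho + num_beams - 1
    return math.exp(-y / rho) * a / (1.0 + y) ** num_beams

def g_direct(y, num_beams, rho, h=1e-6):
    """g = 1 + (1 + y) f'(y) / f(y), with f' from a central difference."""
    slope = (pdf(y + h, num_beams, rho) - pdf(y - h, num_beams, rho)) / (2 * h)
    return 1.0 + (1.0 + y) * slope / pdf(y, num_beams, rho)

def g_lambert_form(y, num_beams, rho):
    """Closed form in W, with W = (1 + y) / ((M - 1) rho); requires M >= 2."""
    w = (1.0 + y) / ((num_beams - 1) * rho)
    m = num_beams
    return -((m - 1) * w * w + (2 * m - 3) * w + m - 1) / (w + 1.0)

for y in (0.0, 0.5, 2.0, 10.0):
    print(y, g_direct(y, 3, 2.0), g_lambert_form(y, 3, 2.0))
```

The two columns should agree to within the accuracy of the finite difference, and both should be negative whenever at least two beams are used.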